Author Topic: 3rd interface not failing back...  (Read 28815 times)

Offline jakehathaway

  • Newbie
  • *
  • Posts: 18
  • Karma: +0/-0
    • View Profile
3rd interface not failing back...
« on: May 21, 2007, 10:37:11 am »
I have a QMOE connection between two data centers, locations A and B. Each d/c is running a pf setup of 2 machines, PF1 and PF2. Each machine has 4 interfaces: 1-WAN, 2-LAN, 3-QMOE, 4-PFSync.
Most traffic travels from A to B, not the other way. At some point the interfaces at location A fail over from PF1 to PF2, and things seem fine. But when they fail back to PF1, int-3 doesn't seem to fail back: PF1 shows backup and PF2 shows master. All traffic appears to be going to PF1, but the routes are not active (since it is in backup).
I have added a screen shot to show how PF1 looks when it is not working.
Thanks for any info.
Jake
« Last Edit: May 21, 2007, 11:26:21 am by jakehathaway »

Offline sullrich

  • Hero Member
  • *****
  • Posts: 5110
  • Karma: +7/-2348
    • View Profile
    • pfSense
Re: 3rd interface not failing back...
« Reply #1 on: May 21, 2007, 11:49:21 am »
Check the switch.  It is not passing multicast correctly.

Also search the forum.  This exact question has been asked (and answered) around 20 times.

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #2 on: May 21, 2007, 03:03:35 pm »
We are looking into the multicasting. Sorry for duplicate post, but I searched for an hour looking for something similar and didn't find it.
If you know of one of them please link to it here, thanks.

Jake

Offline Juve

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 982
  • Karma: +22/-0
  • --=(BSD)=--
    • View Profile
Re: 3rd interface not failing back...
« Reply #3 on: May 21, 2007, 03:18:44 pm »
Log on with SSH or directly from the console (choose option 8).
On the box that is not failing back, do a tcpdump like this:
tcpdump -i <ifname> -ttt -n proto CARP
where <ifname> is the name of the physical interface.

Do the same on the second box.

When all is working fine, you should see a trace showing multicast traffic directed to 224.0.0.18 (the VRRPv2 multicast address), sourced from the IP address of the physical interface of the master node. On the slave node, you should see these packets too. When the master node is powered off, the packets should then be sourced from the slave node, with a higher advskew.
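
As a rough aid for reading such a trace, the sketch below groups the advertisement lines by vrid and lists which source addresses are sending for each one. Only the master should be advertising, so more than one sender for the same vrid suggests two nodes both believe they are master. It assumes the one-line-per-packet format tcpdump prints when it decodes CARP as VRRPv2; the function name `sources_by_vrid` and the sample addresses are hypothetical.

```python
import re
from collections import defaultdict

# Matches the part of a tcpdump line that looks like:
#   "IP 192.0.2.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, ..."
ADVERT = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+) > 224\.0\.0\.18: "
    r"VRRPv2, Advertisement, vrid (?P<vrid>\d+), prio (?P<prio>\d+)"
)

def sources_by_vrid(lines):
    """Return {vrid: {source_ip: prio}} for every advertisement seen."""
    seen = defaultdict(dict)
    for line in lines:
        m = ADVERT.search(line)
        if m:
            seen[int(m.group("vrid"))][m.group("src")] = int(m.group("prio"))
    return dict(seen)

if __name__ == "__main__":
    # Hypothetical sample lines (documentation addresses, not from a real capture).
    sample = [
        "000000 IP 192.0.2.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36",
        "400000 IP 192.0.2.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 100, authtype none, intvl 1s, length 36",
        "600000 IP 192.0.2.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 7, prio 0, authtype none, intvl 1s, length 36",
    ]
    for vrid, sources in sorted(sources_by_vrid(sample).items()):
        note = "single sender" if len(sources) == 1 else "more than one sender"
        print(vrid, sources, note)
```

Saving the tcpdump output from each box to a file and passing the lines through this function on both sides makes it easier to compare what each node actually sees.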


The four main problems you might encounter:

1) Misconfiguration: password, VHID or advskew problems. Check them again.

2) Another device using VRRPv2 is using a VHID you are using. Check your network devices or change the VHID.

3) You don't see the master's packets on the slave node when doing the tcpdump (so the slave node has one or more interfaces in master mode). You have a communication error between the two machines. Check the switches and the cables. Or look at problem 4 ;-)

4) You have a NAT rule, natting everything from a source network to a single IP address which IS NOT the interface address and which is in ANOTHER subnet. This happens on the WAN interface most of the time.
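
For problem 4, the condition can be checked mechanically once you know the interface address and the NAT target. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `nat_target_is_suspect` and the addresses are hypothetical and only illustrate the test described above.

```python
import ipaddress

def nat_target_is_suspect(interface_cidr, nat_target):
    """True when the NAT target is not the interface address
    and also lies outside the interface's subnet."""
    iface = ipaddress.ip_interface(interface_cidr)
    target = ipaddress.ip_address(nat_target)
    return target != iface.ip and target not in iface.network

if __name__ == "__main__":
    # Hypothetical WAN interface 192.0.2.1/24 (documentation range).
    for target in ("192.0.2.1", "192.0.2.10", "198.51.100.7"):
        print(target, nat_target_is_suspect("192.0.2.1/24", target))
```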

« Last Edit: May 21, 2007, 03:20:36 pm by Juve »

Offline sullrich

Re: 3rd interface not failing back...
« Reply #4 on: May 21, 2007, 03:27:12 pm »
Stickying thread.

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #5 on: May 22, 2007, 10:43:25 am »
tcpdump -i xl0 -ttt -n proto CARP
Here is the output of my tcpdump:
709630 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
293069 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
1. 002309 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
487570 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
514636 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
1. 001317 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
267018 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
734179 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
1. 001057 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
047719 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
953636 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
829337 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
171683 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
1. 001111 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
610157 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
391038 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
1. 234670 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36
157247 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36
1. 039601 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36

The 151 is the master machine; the 251 is the machine on the other side of the QMOE link, which is the other pfSense firewall box. You can see the vrid is different, so that shouldn't affect it.
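
One way to sanity-check a capture like the one above: CARP's advskew is measured in 1/256 of a second, so a master should advertise roughly every advbase + advskew/256 seconds. Since tcpdump prints CARP with its VRRPv2 decoder, the sketch below assumes that `vrid` corresponds to the VHID, `prio` to advskew, and `intvl` to advbase. It also assumes the `-ttt` deltas appear either as six microsecond digits or as seconds, a dot, a space, and six digits, as in the capture above. The function names are hypothetical.

```python
import re

# Delta timestamp ("709630" or "1. 002309"), then the source address.
LINE = re.compile(r"^(?:(?P<sec>\d+)\. )?(?P<usec>\d{6}) IP (?P<src>[\d.]+) > ")

def carp_interval(advbase, advskew):
    """Expected seconds between advertisements from a CARP master."""
    return advbase + advskew / 256.0

def gaps_per_source(lines):
    """Return {source_ip: [seconds between successive packets from it]}."""
    clock = 0.0
    last_seen = {}
    gaps = {}
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        # -ttt gives the delta since the previous line, so accumulate it.
        clock += int(m.group("sec") or 0) + int(m.group("usec")) / 1e6
        src = m.group("src")
        if src in last_seen:
            gaps.setdefault(src, []).append(clock - last_seen[src])
        last_seen[src] = clock
    return gaps

if __name__ == "__main__":
    # First four lines of the capture above.
    capture = [
        "709630 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36",
        "293069 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36",
        "1. 002309 IP 172.16.20.251 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 0, authtype none, intvl 1s, length 36",
        "487570 IP 172.16.20.152 > 224.0.0.18: VRRPv2, Advertisement, vrid 6, prio 200, authtype none, intvl 1s, length 36",
    ]
    for src, observed in gaps_per_source(capture).items():
        print(src, [round(g, 3) for g in observed])
    print("expected for prio 200:", carp_interval(1, 200))
    print("expected for prio 0:", carp_interval(1, 0))
```

If the observed gaps track these expected values, the `prio` column is most likely the advskew of the sending node, which helps tell a designated master from a backup that has taken over.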

1) Misconfiguration: password, VHID or advskew problems, check it again.

Checked this, it is correct.

2) Another device using VRRPv2 is using a VHID you are using, check your network devices or change VHID

Obviously it is connected to the pfSense on the other side of the QMOE. I'm not sure if vrid is the same as vhid, but I manually checked the config of both sides of the QMOE in the GUI, and the vhid is different.

3) You don't see master's packets on the slave node when doing the tcpdump (so the slave node has one or more interface in master mode). You have a communication error between the two machines. Check the switches, the cables. Or look at problem 4 ;-)

I see the master packets; see the tcpdump output above.

4) You have a NAT rule, natting everything from a source network to a single IP address which IS NOT the interface address and which is in ANOTHER subnet. Should happen on WAN iface most of the time.

Still checking this. But not sure what that would affect. Will post follow-up in a bit.

thx for the help with this.
« Last Edit: May 22, 2007, 11:00:51 am by jakehathaway »

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #6 on: May 22, 2007, 11:18:07 am »
As for NAT to a single IP, we do not have that on the network that is having trouble.

As you can see in the image the last rule is for the qmoe and it goes to * (all).

Offline Juve

Re: 3rd interface not failing back...
« Reply #7 on: May 22, 2007, 01:01:56 pm »
Can you give us a network diagram? As I understand it, you have 4 machines: 2 at d/c A and 2 at d/c B.
« Last Edit: May 22, 2007, 01:07:46 pm by Juve »

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #8 on: May 22, 2007, 03:01:35 pm »
Here is a simple drawing. The pf2 box, interface 4 (QMOE) is the only one that doesn't failback.

Offline sullrich

Re: 3rd interface not failing back...
« Reply #9 on: May 22, 2007, 08:09:20 pm »
Are all of the nics the same type?

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #10 on: May 23, 2007, 04:04:11 pm »
NIC types...
pf1 and pf2:
int 1 - Intel Pro 100 - WAN
int 2 - Intel Pro 100 - LAN
int 3 - Intel Pro 100 - pfsync
int 4 - 3com 3C905-TX - QMOE

pf251 and pf252:
int 1 - Intel e1000 - LAN
int 2 - Intel e1000 - WAN
int 3 - Broadcom Gbit - QMOE
int 4 - Broadcom Gbit - pfsync

Offline Juve

Re: 3rd interface not failing back...
« Reply #11 on: May 24, 2007, 09:24:27 am »
Have you checked that neither the Foundry nor the HP equipment is filtering any type of traffic (like multicast)?

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #12 on: May 24, 2007, 10:09:43 am »
Yep, multicast is working just fine. The Foundry side (pf251, pf252) is working fine. It is the HP side that is having the failback problem. But we checked the multicast and it is fine. I can also see it in the tcpdump on pf2.

Offline jakehathaway

Re: 3rd interface not failing back...
« Reply #13 on: May 31, 2007, 12:14:44 am »
Is there any other information you can give me? Anything else you might try? Please let me know.

Offline sullrich

Re: 3rd interface not failing back...
« Reply #14 on: May 31, 2007, 11:27:37 am »
Check network equipment on HP side.  Something is being blocked (multicast).