Author Topic: WAN interfaces flapping with multiWAN  (Read 4443 times)
familyguy
« Reply #30 on: October 09, 2008, 06:00:47 pm »

Can you all check what hz you are running at?
It should come out of sysctl kern.hz. If it is greater than 1000, try setting it to 500 and retry.
hz 2000 would be interesting, but we will see.

Huh?  I don't understand what you just said.  What are you suggesting we change and why?

Best,
cmb
Administrator
« Reply #31 on: October 12, 2008, 10:28:26 pm »

I'm sure they're all at default hz. Changing that isn't a solution regardless.

We'll get a resolution to this eventually. If it's an immediate problem for you, you'll have to downgrade to 1.2. This isn't going to be easy or quick to resolve.
pfSense Commercial Support

Paying customers receive support priority and as in depth of assistance as desired through the official commercial support channels at portal.pfsense.org. Forum users receive as much help as time permits.
familyguy
« Reply #32 on: October 13, 2008, 10:40:54 pm »

OK.  I think downgrading looks like the path of least resistance.  The complaining from folks with frequently dropped connections at the office is getting rather shrill.  Looking forward to an eventual fix.

Best,
cheesyboofs
« Reply #33 on: October 15, 2008, 03:50:12 am »

For what it's worth, I was seeing this too and have also downgraded to 1.2-Release. It's a shame, as I hate going backwards. You need a firewall to be reliable and stable, and it's hard to test a new beta without putting it in 'service'.
The needs of the many outweigh the needs of the few or the one …

ermal
Administrator
« Reply #34 on: October 19, 2008, 03:44:40 am »

The latest snapshots have a fix for this. Can you, if possible, test and report whether it behaves correctly now?
familyguy
« Reply #35 on: October 19, 2008, 08:03:51 pm »

I'll give it a try next time I'm on site.  What was the nature of the fix?

Best,
cmb
Administrator
« Reply #36 on: October 20, 2008, 09:17:48 pm »

slbd used to use fping to determine if a WAN was online. There is some kernel change in FreeBSD 7.0 that causes problems because fping sees replies from pings initiated by other processes.  Usually RRD for quality graph and slbd for monitor IP are both pinging the gateway IPs on your WAN (the fact that two processes are pinging the same thing is something we're eliminating in 1.3, but is too significant a change to pull into a maintenance release).

Now, slbd runs a shell script (for easy changing and testing, because the process being run is hard coded into the slbd binary) which runs FreeBSD's ping. It knows which replies are supposed to go where, and should behave properly unlike fping. The ping in FreeBSD 7.0 supports everything we were doing with fping. This should hopefully be resolved now.
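
To illustrate the "which replies are supposed to go where" part, here is a minimal Python sketch of how a ping-style tool can tell its own echo replies apart from replies to another process's pings. It is not taken from fping, slbd or FreeBSD's ping. It only shows the general idea that a process reading raw ICMP can see replies meant for someone else and has to match the identifier and sequence number itself. The names `build_echo_reply`, `is_our_reply`, `SLBD_ID` and `RRD_ID` are made up for the example, and the checksum is left at zero for brevity.

```python
import struct

ICMP_ECHO_REPLY = 0  # ICMP type for an echo reply


def build_echo_reply(ident, seq, payload=b""):
    """Build a minimal ICMP echo reply (checksum left at 0 for brevity)."""
    return struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, 0, ident, seq) + payload


def is_our_reply(icmp_msg, ident, seq):
    """Return True only if this ICMP message answers *our* echo request."""
    if len(icmp_msg) < 8:
        return False  # too short to hold an ICMP echo header
    msg_type, code, _checksum, r_ident, r_seq = struct.unpack(
        "!BBHHH", icmp_msg[:8]
    )
    return (
        msg_type == ICMP_ECHO_REPLY
        and code == 0
        and r_ident == ident
        and r_seq == seq
    )


# Two hypothetical monitors pinging the same gateway, each with its own id.
SLBD_ID, RRD_ID = 0x1234, 0x5678
reply_for_rrd = build_echo_reply(RRD_ID, 1)

print(is_our_reply(reply_for_rrd, SLBD_ID, 1))  # False: not slbd's reply
print(is_our_reply(reply_for_rrd, RRD_ID, 1))   # True: matches RRD's ping
```

A monitor that skips this check would count the other process's reply as its own, which is the kind of confusion described above.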
NickC
« Reply #37 on: October 21, 2008, 05:52:27 am »

Confirm flapping stopped. Thanks for the fix.

Nick.
ermal
Administrator
« Reply #38 on: October 21, 2008, 07:26:15 am »

Can you test that it behaves properly if you disconnect one of the WANs, in either failover or load-balancing mode?
This would help push the 1.2.1 release.
NickC
« Reply #39 on: October 21, 2008, 12:42:33 pm »

I'm running multiple failover (not balance) multi-WAN on a CARP cluster.
Watching syslog messages as they came through, I unplugged the phone line so the ping would fail but the interfaces would stay up.

It took 30s for the message to come through:
"ICMP poll failed...marking service DOWN"

Plugged back in and "marking service as UP"

I don't know how long it took before, but I think it was a little more responsive than this. If you think I'm just seeing a delay in the syslog pathway, I can time it more carefully using the logs.

Nick.
ermal
Administrator
« Reply #40 on: October 21, 2008, 01:44:48 pm »

Nothing has changed in that respect, apart from the good news that it is working now.
If you wish, /usr/local/sbin/slbd.sh has the command that checks the status. As far as I am concerned, you may replace it with anything you please, as long as it returns the status.

It may be worth confirming that it is syslog latency, though 30 seconds is not that bad either :P
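
For anyone wanting to experiment with replacing the check in slbd.sh, here is a rough Python sketch of a check that "just returns the status". It is not the real contents of slbd.sh. It assumes a Python interpreter is available on the box (a stock install may not have one), that the monitor IP is passed as the first argument (an assumption, I have not checked how slbd calls the script), and that status means exit code 0 for up and non-zero for down. `build_ping_command` and `check_gateway` are made-up names. The flags are FreeBSD ping's: `-q` quiet, `-c` packet count, `-t` overall timeout in seconds (on Linux `-t` means TTL instead).

```python
import subprocess
import sys


def build_ping_command(target, count=3, timeout_s=5):
    """Build a FreeBSD ping command line for one status check."""
    # -q: quiet, -c: number of packets, -t: give up after timeout_s seconds
    return ["/sbin/ping", "-q", "-c", str(count), "-t", str(timeout_s), target]


def check_gateway(target, run=subprocess.call):
    """Return 0 if the monitor IP answered at least once, 1 otherwise."""
    rc = run(build_ping_command(target))
    # FreeBSD ping exits 0 when at least one reply was received
    return 0 if rc == 0 else 1


# Hypothetical calling convention: the monitor IP is the only argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(check_gateway(sys.argv[1]))
```

The `run` parameter is only there so the logic can be tried with a fake runner without sending real pings.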
cwadge
« Reply #41 on: October 21, 2008, 07:20:59 pm »

I'm discovering that on the latest nightly of 1.2.1rc1, the interface does get marked down after a few seconds. But unfortunately, the failover never actually takes place. If the interface goes down, it's simply down until you bring it back up. The downed network never falls back to the other gateway.
cwadge
« Reply #42 on: October 21, 2008, 07:50:37 pm »

Ahhh, please disregard this. After some further troubleshooting, it turned out that the core router I was using to gauge uptime for that interface could be reached through either interface.

However, that does bring up two interesting points.
  • It took about 1 minute for the connection to completely fail over. The hardware is a dual 600MHz P3 with 384 MB RAM, but it seems to be the load balancer that's taking so long to recognize that it's down, not the filter reload afterwards.
  • In this situation, I can only assume (until I dig a little deeper) that the pfSense box automatically started pinging the other gateway's core router through the gateway that was up. I'm not sure if this is considered a bug or just an undesirable feature... but ultimately, what's the difference?
« Last Edit: October 21, 2008, 08:24:39 pm by cwadge »