Author Topic: New 502 Bad Gateway  (Read 5865 times)

Offline jimp

  • Administrator
  • Hero Member
  • *****
  • Posts: 21113
  • Karma: +1381/-25
    • View Profile
Re: New 502 Bad Gateway
« Reply #75 on: October 12, 2017, 12:53:05 pm »
Code: [Select]
# /usr/bin/netstat -Ln
unix  193/0/128                        /var/run/php-fpm.socket
Code: [Select]
[...]
tcp4       0      0 127.0.0.1.8081         192.168.16.73.43834         0      0      0      0  65700  65700      1   2048      0      0 525600 525600    0.00    0.00    0.00    0.00    0.00 3797.36
[...]
fffff8000d6e7960 stream   1116      0                0 fffff8000d617a50                0                0 /var/run/php-fpm.socket
fffff8000d617a50 stream      0      0                0 fffff8000d6e7960                0                0
[...]

So there are ~190+ things stuck doing a PHP operation, and the same number of stuck connections hitting the dnsbl daemon. The only thing pfBlocker does with lighty is run /usr/local/www/pfblockerng/www/index.php

So something in that file is getting stuck and making those pile up.  Probably its file lock operation, maybe something isn't giving up a lock and everything else is stuck waiting.
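For anyone who wants to check their own box for the same symptom, here is a minimal sketch that filters `netstat -Ln` output for listen queues at or over their limit. It assumes the second column has the qlen/incqlen/maxqlen form shown above; the function name `overflowing_queues` is made up for illustration and is not part of pfSense.

```shell
# Read `netstat -Ln` output on stdin and print every listening socket whose
# queue length (qlen) has reached or passed its limit (maxqlen).
# Output format: <local address> <qlen> <maxqlen>
overflowing_queues() {
    awk '$2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
        split($2, q, "/")                 # q[1]=qlen, q[2]=incqlen, q[3]=maxqlen
        if (q[1] + 0 >= q[3] + 0) print $NF, q[1], q[3]
    }'
}

# Typical use on the firewall (not run here):
#   /usr/bin/netstat -Ln | overflowing_queues
```

Header lines and healthy sockets are skipped, so an empty result means no queue is currently backed up.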

Try editing /usr/local/www/pfblockerng/www/index.php and commenting out or removing the whole "Increment DNSBL Alias Counter" block and see if it makes a difference. Keep a backup so you can put it back later if there is no change.

Someone should probably bring this to bbcan's attention in the meantime.
« Last Edit: October 12, 2017, 01:08:48 pm by jimp »
Need help fast? Commercial Support!

Co-Author of pfSense: The Definitive Guide. - Check the Doc Wiki for FAQs.

Do not PM for help!

Offline BreeOge

  • Jr. Member
  • **
  • Posts: 61
  • Karma: +6/-0
    • View Profile
Re: New 502 Bad Gateway
« Reply #76 on: October 12, 2017, 01:04:38 pm »
Done, let ya know if it crashes again.

Thank you for helping resolve this issue.  Many of us here appreciate your time on this.

Offline MaxPF

  • Full Member
  • ***
  • Posts: 256
  • Karma: +1/-0
    • View Profile
Re: New 502 Bad Gateway
« Reply #77 on: October 12, 2017, 03:27:31 pm »
Quote

For anyone still seeing the problem after updating to 2.4.0-RELEASE, please gather the info I asked for in https://forum.pfsense.org/index.php?topic=137103.msg753994#msg753994 before rebooting the firewall and also supply a full list of installed packages that are running.

pfBlocker is mentioned a lot, but at least in the output shown so far, squid+clamav appeared to be more likely at fault.

The problem with that, at least in my case, is that the console is unusable either locally or remotely. There is no menu, just a black screen, and no matter what command I try nothing happens. Even CTRL+C just shows ^C.

Running 2.4 Release now. See how it goes.

Offline jimp

Re: New 502 Bad Gateway
« Reply #78 on: October 12, 2017, 03:32:13 pm »
Quote

The problem with that, at least in my case, is that the console is unusable either locally or remotely. There is no menu, just a black screen and no matter what command I try nothing happens.

Try Ctrl-Z and then run /bin/tcsh

Offline john_galt

  • Jr. Member
  • **
  • Posts: 38
  • Karma: +2/-0
    • View Profile
Re: New 502 Bad Gateway
« Reply #79 on: October 12, 2017, 04:07:23 pm »
BreeOge,

Could you post the edits you made to the index.php file please? I found it but am unsure what to comment out.

Also, the change I made to the tunable kern.ipc.soacceptqueue did not stop the crash.


Doug

Offline jimp

Re: New 502 Bad Gateway
« Reply #80 on: October 12, 2017, 04:32:59 pm »
Quote

Also, the change I made to the tunable kern.ipc.soacceptqueue did not stop the crash.

Knowing what we know now, that is not surprising. The kern.ipc.soacceptqueue tunable is only for TCP, and this is a unix socket queue overflowing. There isn't a tunable for that; IIRC it's set by whatever sets up the socket (php-fpm in this case). But increasing that wouldn't solve the problem, only hide it longer.

Offline BreeOge

Re: New 502 Bad Gateway
« Reply #81 on: October 12, 2017, 04:33:41 pm »
Quote

BreeOge,

Could you post the edits you made to the index.php file please? I found it but am unsure what to comment out.

Also, the change I made to the tunable kern.ipc.soacceptqueue did not stop the crash.

Doug


The file is at this location

/usr/local/www/pfblockerng/www/index.php

Code: [Select]
cd /usr/local/www/pfblockerng/www/
Code: [Select]
cp index.php index.old    # keep a copy of the original before you remove the section
Edit index.php with your favorite editor, and remove this section at the bottom.

Code: [Select]
if (!empty($pfb_query)) {
    // Increment DNSBL Alias Counter
    $dnsbl_info = '/var/db/pfblockerng/dnsbl_info';
    if (($handle = @fopen("{$dnsbl_info}", 'r')) !== FALSE) {
        flock($handle, LOCK_EX);
        $pfb_output = @fopen("{$dnsbl_info}.bk", 'w');
        flock($pfb_output, LOCK_EX);
        // Find line with corresponding DNSBL Aliasname
        while (($line = @fgetcsv($handle)) !== FALSE) {
            if ($line[0] == $pfb_query) {
                $line[3] += 1;
            }
            @fputcsv($pfb_output, $line);
        }
        @fclose($pfb_output);
        @fclose($handle);
        @rename("{$dnsbl_info}.bk", "{$dnsbl_info}");
    }
}
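
The backup step can also be scripted so that running it a second time does not overwrite the pristine copy with an already-edited file. This is only a sketch using plain `cp`; the function names `backup_once` and `restore_backup` are made up here for illustration.

```shell
# backup_once SRC BACKUP
# Copy SRC to BACKUP unless BACKUP already exists, so a repeat run cannot
# replace the original backup with an edited index.php.
backup_once() {
    if [ -e "$2" ]; then
        echo "backup $2 already exists, leaving it alone" >&2
        return 0
    fi
    cp -p "$1" "$2"
}

# restore_backup SRC BACKUP
# Put the saved original back in place if removing the block made no difference.
restore_backup() {
    cp -p "$2" "$1"
}

# Typical use on the firewall (not run here):
#   cd /usr/local/www/pfblockerng/www/
#   backup_once index.php index.old
```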
« Last Edit: October 12, 2017, 04:54:42 pm by BreeOge »

Offline john_galt

Re: New 502 Bad Gateway
« Reply #82 on: October 12, 2017, 04:56:22 pm »
Thanks BreeOge
Doug

Offline crisdavid

  • Jr. Member
  • **
  • Posts: 84
  • Karma: +12/-1
  • pfSense is the future of Networking.
    • View Profile
Re: New 502 Bad Gateway
« Reply #83 on: October 13, 2017, 08:01:24 am »
I have one box using the ZFS file structure, the other is using UFS, both using pfBlockerNG.  The ZFS is rock solid, and the UFS one gets the Bad Gateway after some time.  Wondering if that is a possible reason why two similar boxes with similar settings exhibit different behavior using the same snapshot and same packages.

Both running 20171009 Snapshots for 2.4.0

Just a thought

It would seem ZFS and pfBlockerNG play together more nicely than UFS and pfBlockerNG, at least before the jump to 2.4.0 Release. I reinstalled under ZFS and restored my configuration file from just before the reinstall, and it's been running solid on both of my boxes that were affected. Normally it would last 20 minutes before I got the gateway error, but now there is not an error in the logs in sight. Previously I saw the line that stated "Listen queue overflow: 193 already in queue awaiting acceptance (x occurrences)".
Both of My pfSense boxes:
Dell OptiPlex 7010 SFF
OS: pfSense 2.4
CPU: I5-3570 3.4 GHz
RAM: 4GB
NIC: Intel EXPI9402PT Pro
Hard drive: 500GB

Offline AhnHEL

  • Hero Member
  • *****
  • Posts: 632
  • Karma: +18/-0
  • It is what it is.
    • View Profile
Re: New 502 Bad Gateway
« Reply #84 on: October 13, 2017, 09:35:15 am »
Just reinstalled myself changing from UFS to ZFS filesystem, using the same 20171009 snapshot.  Wouldn't last ten minutes before, but has been up without error for 24 hours now.

Never used Squid or ClamAV.  Only using pfBlockerNG.
AhnHEL (Angel)
NYC

2 pfSense sites: 2.4 (amd64)
Dell 755 SFF E6550 @ 2.3Ghz, 4GB RAM, 100/30 Mbps, Intel X3959
Dell 7010 SFF i5-3570 @ 3.4Ghz, 8GB RAM, 940/880 Mbps, Intel X3959
OpenVPN (Peer to Peer, Road Warrior), pfBlockerNG, Gaming


Offline john_galt

Re: New 502 Bad Gateway
« Reply #85 on: October 13, 2017, 09:52:49 am »
Thanks for that report AhnHEL. I plan on doing the same thing tomorrow morning.


Doug

Offline BreeOge

Re: New 502 Bad Gateway
« Reply #86 on: October 13, 2017, 10:43:25 am »
Quote

So there are ~190+ things stuck doing a PHP operation, and the same number of stuck connections hitting the dnsbl daemon. The only thing pfBlocker does with lighty is run /usr/local/www/pfblockerng/www/index.php

So something in that file is getting stuck and making those pile up.  Probably its file lock operation, maybe something isn't giving up a lock and everything else is stuck waiting.

Try editing /usr/local/www/pfblockerng/www/index.php and commenting out or removing the whole "Increment DNSBL Alias Counter" block and see if it makes a difference. Keep a backup so you can put it back later if there is no change.


So far, I have not had a crash since I removed that section.  Been 21 hours, and still running strong. Looks like jimp found the issue.  Now the question is what does it affect and why is it in there.

If it works well on ZFS and not UFS, this also makes some sense, as the error didn't show up on 2.4.0 till it was updated to FreeBSD 11.1 from 11.0.  So something must have changed in the filesystem workings, and UFS now doesn't like the file locking that pfBlockerNG uses.
« Last Edit: October 13, 2017, 10:49:01 am by BreeOge »

Offline luckman212

  • Hero Member
  • *****
  • Posts: 716
  • Karma: +55/-0
    • View Profile
    • @luckman212 - github
Re: New 502 Bad Gateway
« Reply #87 on: October 13, 2017, 10:47:32 am »
Clicked on the link to read the latest post and got this...

[attached image not preserved]

joke? Lol. Happy Friday!

Offline jimp

Re: New 502 Bad Gateway
« Reply #88 on: October 13, 2017, 10:55:22 am »
Quote

So far, I have not had a crash since I removed that section.  Been 21 hours, and still running strong. Looks like jimp found the issue.  Now the question is what does it affect and why is it in there.

If it works well on ZFS and not UFS, this also makes some sense, as the error didn't show up on 2.4.0 till it was updated to FreeBSD 11.1 from 11.0.  So something must have changed in the filesystem workings, and UFS now doesn't like the file locking that pfBlockerNG uses.

That's possible. It looks like it's trying to keep some stats about what was hit, but it's using a plain text file to do it. IMO that should be an sqlite database and not a plain text CSV file, but not having looked at the rest of the related code I'm not sure what changing that would entail, or if anything absolutely relies on that being plain text. I've been told that bbcan is aware, though, and he's looking into it. That could also explain 502/php issues people have had in the past with the pfblocker widget.
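
To illustrate the sqlite idea, here is a rough sketch of a per-alias hit counter using the `sqlite3` command-line tool. It is not pfBlockerNG code: the table and column names, the function names, and the 2 second timeout are all assumptions made for the example, and it presumes `sqlite3` is installed. The point is that the increment is a single short transaction with a bounded wait, rather than a rewrite of the whole file under an exclusive lock.

```shell
# bump_dnsbl_counter DBFILE ALIAS
# Hypothetical replacement for the CSV counter: one row per DNSBL alias.
bump_dnsbl_counter() {
    db=$1
    # Double any single quotes so the alias is safe inside an SQL string literal.
    alias_sql=$(printf '%s' "$2" | sed "s/'/''/g")

    # ".timeout 2000" makes a writer wait at most about 2 seconds for a busy
    # database and then fail, instead of blocking without limit.
    sqlite3 -cmd ".timeout 2000" "$db" "
        CREATE TABLE IF NOT EXISTS dnsbl_counter (
            alias TEXT PRIMARY KEY,
            hits  INTEGER NOT NULL DEFAULT 0
        );
        BEGIN IMMEDIATE;
        INSERT OR IGNORE INTO dnsbl_counter (alias) VALUES ('$alias_sql');
        UPDATE dnsbl_counter SET hits = hits + 1 WHERE alias = '$alias_sql';
        COMMIT;
    "
}

# read_dnsbl_counter DBFILE ALIAS
# Print the current hit count for ALIAS (prints nothing if the alias is unknown).
read_dnsbl_counter() {
    alias_sql=$(printf '%s' "$2" | sed "s/'/''/g")
    sqlite3 "$1" "SELECT hits FROM dnsbl_counter WHERE alias = '$alias_sql';"
}
```

Whether anything else in the package depends on the plain text file is a separate question that this sketch does not address.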

Offline BreeOge

Re: New 502 Bad Gateway
« Reply #89 on: October 13, 2017, 10:58:04 am »
I have a PM going with BBcan; I pointed him to our findings a few minutes ago.

I am just glad we seem to be narrowing this down.