Netgate m1n1wall

Author Topic: 2.0 RC1 CPU at 100% after 1-4 days  (Read 5130 times)

Offline Coinbird

  • Newbie
  • *
  • Posts: 14
  • Karma: +0/-0
    • View Profile
2.0 RC1 CPU at 100% after 1-4 days
« on: March 23, 2011, 11:31:25 am »
(EDIT: seems to happen as soon as a few hours after booting)

2.0-RC1 (i386)
built on Sat Feb 26 15:30:26 EST 2011

Hello,
  I have a 32-bit pfSense 2.0 RC1 install on an (older) AMD Sempron with two Trendnet Gigabit NICs, running off a hard drive. The connection is 1.5 Mbit DSL, nothing spectacular. CPU usage is usually around 3-4%.
  For some reason, after 3-4 days, I'll notice my network acting slowly/strangely (but still working), only to log onto the console, bring up top, and find it laden with a ton of inetd and nc processes maxing out the CPU. The RRD graphs seem to go blank at the point where the CPU goes nuts. I'm running the SMP kernel, and I tried the other kernel during my initial troubleshooting (I first thought the problem was caused by running off a USB stick, so I switched over to a hard drive).
  During this time the box is barely usable, so the last time this happened (this morning) I just snapped a picture of the screen. Other than that, the only "weird" thing I see after I reboot is six "Bump sched buckets to 64 (was 0)" messages.
I've disabled pretty much everything in the BIOS.

vmstat -i
interrupt                          total       rate
irq0: clk                        2154812        999
irq1: atkbd0                          18          0
irq8: rtc                         275810        127
irq10: re0 uhci0+                 218310        101
irq11: re1 atapci0*               219425        101
irq14: ata0                         5462          2
Total                            2873837       1333

  I'm not sure what to check next, so I wanted to get some advice from the forum, e.g. what I should be checking for in the logs. Does anything strange stand out? I couldn't find any open issues for 2.0 that cover this, but perhaps I wasn't searching for the right thing. (I considered posting in Hardware but wasn't sure, because this seemed specific to 2.0.) Thanks for looking!

Attached: image of top during strange behavior
« Last Edit: March 24, 2011, 04:48:53 pm by Coinbird »

Offline NiteSnow

  • Newbie
  • *
  • Posts: 8
  • Karma: +0/-0
    • View Profile
Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #1 on: March 23, 2011, 04:04:28 pm »
Try updating pfSense; you're using an older build. pfSense 2.0-RC1 is updated constantly, sometimes more than once a day. I have an AMD Athlon K7 1.3 GHz with 512 MB of RAM from 2001, and I've never seen this problem in the just under a month and a half that I've been using 2.0-RC1.

Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #2 on: March 23, 2011, 04:12:39 pm »
I'll try updating and keep an eye on it. Since my build is less than 1.5 months old, I can't help but suspect some hardware configuration issue.
(I'd feel more confident about updating if this had been a known issue that was since resolved; the closest thing I found was a bug related to the rate service (for traffic graphs) causing high CPU usage after a few days.)
« Last Edit: March 23, 2011, 04:15:28 pm by Coinbird »

Offline wallabybob

  • Hero Member
  • *****
  • Posts: 5262
  • Karma: +0/-0
    • View Profile
Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #3 on: March 23, 2011, 07:11:43 pm »
Top reports
Quote
674 processes, 262 running, 377 zombie

That you have so many inetd processes and zombie processes suggests there might be an issue with the interaction between inetd and a child process.
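
To put numbers on that while the box is still responsive, the process table can be summarised per command name. The following is only a minimal sketch: the function name `summarise_procs` is made up for illustration, and it assumes `ps -axo stat,comm` prints a state column (where a leading `Z` marks a zombie) followed by the command name. How zombies are labelled may differ between builds.

```shell
# Summarise process counts per command name, with a separate zombie count.
# Reads "STAT COMMAND" pairs on stdin, e.g. from: ps -axo stat,comm
# (summarise_procs is a hypothetical helper name, not a pfSense tool.)
summarise_procs() {
    awk '
        NR == 1 && $1 == "STAT" { next }    # skip the ps header line
        {
            total[$2]++
            if (substr($1, 1, 1) == "Z")    # a state starting with Z is a zombie
                zombie[$2]++
        }
        END {
            for (c in total)
                printf "%s total=%d zombie=%d\n", c, total[c], zombie[c] + 0
        }
    ' | sort
}

# Usage on the firewall console:
#   ps -axo stat,comm | summarise_procs
```

A large `inetd` or `nc` total with a high zombie count would point at the same inetd/child interaction.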

On my system
Quote
# more /var/etc/inetd.conf
tftp-proxy      dgram   udp     wait            root    /usr/libexec/tftp-proxy tftp-proxy -v
#
suggesting tftp is the only thing inetd is likely to start.

What uses inetd on your system? The pfSense shell command
Quote
# clog /var/log/system.log | grep inetd
MIGHT provide some hints.

Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #4 on: March 24, 2011, 11:26:28 am »
Thanks for the suggestions.
It happened again overnight, after only about 8 hours of uptime. Another thing of note is that there are a ton of "nc" (netcat) processes along with all the zombie inetd processes.
I grepped the system log for inetd (prior to rebooting) and didn't see any messages containing it. Unfortunately I didn't realize that system.log is entirely wiped during a reboot, so I'll make sure to scp it over beforehand when it happens again.
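
Since the log does not survive a reboot, one option is to dump it to a timestamped file first and scp that file off afterwards. This is a rough sketch only: `snapshot_log` is a hypothetical helper, `/root` is an assumed destination, and it uses `clog` (the command quoted earlier in the thread for reading system.log) when available, falling back to `cat` elsewhere.

```shell
# Save a copy of a log to a timestamped file before rebooting.
# snapshot_log is a hypothetical helper; /root is an assumed default target.
snapshot_log() {
    src=$1                      # e.g. /var/log/system.log
    dest_dir=${2:-/root}        # where the copy is written
    out="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    if command -v clog >/dev/null 2>&1; then
        clog "$src" > "$out"    # circular log: dump it via clog
    else
        cat "$src" > "$out"     # plain-text log elsewhere
    fi
    echo "$out"                 # print the path of the saved copy
}

# Usage before rebooting:
#   snapshot_log /var/log/system.log
```

The printed path can then be copied to another machine before the reboot.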

The only "non-standard" packages I have running are snort and OpenVPN. Something weird I noticed in the system.log is that each of the snort log entries is duplicated, such as
Code:
Mar 24 09:10:04 snort[52667]: --== Initialization Complete ==--
Mar 24 09:10:04 snort[52667]: --== Initialization Complete ==--

I disabled snort from the webConfigurator and the system didn't recover, but I'm leaving it disabled for the time being to assist with the troubleshooting.
When (if) it happens again, I'll make sure I get the system.log to try and correlate any messages with when the system freaks out.



Offline wallabybob

Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #5 on: March 24, 2011, 03:50:23 pm »
What is in your /var/etc/inetd.conf?

Do you have anything attempting to use tftp or any other service in /var/etc/inetd.conf?


Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #6 on: March 24, 2011, 03:55:25 pm »
It looks like tftp plus entries matching the firewall rules for a few of my external IPs (first three octets edited to xxx):
Code:
tftp-proxy dgram udp wait root /usr/libexec/tftp-proxy tftp-proxy -v
19000 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.26  xxx.xxx.xxx.86 25
19001 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.26  xxx.xxx.xxx.86 25
19002 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.26  xxx.xxx.xxx.86 25
19003 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.26  xxx.xxx.xxx.86 25
19004 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.4  xxx.xxx.xxx.85 443
19005 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.4  xxx.xxx.xxx.85 222
19006 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.26  xxx.xxx.xxx.86 222
19007 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19007 dgram udp nowait/0 nobody /usr/bin/nc nc -u -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19008 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19008 dgram udp nowait/0 nobody /usr/bin/nc nc -u -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19009 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19009 dgram udp nowait/0 nobody /usr/bin/nc nc -u -w 2000   10.0.0.4  xxx.xxx.xxx.85 54
19010 stream tcp nowait/0 nobody /usr/bin/nc nc -w 2000   10.0.0.115  xxx.xxx.xxx.91 80
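
For reading a listing like this: each inetd.conf line is service name, socket type, protocol, wait/nowait (optionally with /max-child), user, server program, then the program's arguments. With `nowait`, inetd starts a new `nc` for every incoming connection, and as I read the FreeBSD inetd documentation a max-child of 0 means no limit, which would fit the pile-up of nc processes (worth checking against the man page on your build). The sketch below just prints the listening port, protocol and nc arguments per entry; `list_nc_entries` is a made-up helper name.

```shell
# Print "port/proto -> nc arguments" for every nc entry in an inetd.conf file.
# inetd.conf fields: service socket-type protocol wait[/max] user program args...
# (list_nc_entries is a hypothetical helper name.)
list_nc_entries() {
    awk '
        $1 ~ /^#/ || NF < 7 { next }        # skip comments and short lines
        $6 ~ /\/nc$/ {
            args = ""
            for (i = 8; i <= NF; i++)       # $7 is argv[0] ("nc"), so start at $8
                args = args (args == "" ? "" : " ") $i
            printf "%s/%s -> %s\n", $1, $3, args
        }
    ' "${1:-/var/etc/inetd.conf}"
}

# Usage:
#   list_nc_entries /var/etc/inetd.conf
```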

Offline wallabybob

Re: 2.0 RC1 CPU at 100% after 3-4 days
« Reply #7 on: March 24, 2011, 04:50:33 pm »
On my pfSense:
Quote
# /usr/bin/nc -w 2000 10.0.0.26 205.126.89.86 25
nc: port number invalid: 205.126.89.86
#

Also, when I compared the nc command in inetd.conf against the FreeBSD man page for nc, the command didn't seem to match the synopsis in the man page.
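
The error above shows nc treating the second positional argument as the port, so any entry with three positional arguments (two addresses plus a port) would fail the same way. A quick way to flag such entries is sketched below. It is deliberately narrow: `check_nc_entries` is a hypothetical helper, and it only understands the two options that appear in the posted listing, `-u` (no value) and `-w` (takes a value).

```shell
# Flag inetd.conf nc entries that have more than two positional arguments.
# Assumption: only -u (flag) and -w <value> appear, as in the posted listing.
# (check_nc_entries is a hypothetical helper name.)
check_nc_entries() {
    awk '
        $1 ~ /^#/ || NF < 7 || $6 !~ /\/nc$/ { next }
        {
            n = 0
            for (i = 8; i <= NF; i++) {         # $7 is argv[0] ("nc")
                if ($i == "-w") { i++; continue }   # skip -w and its value
                if ($i ~ /^-/) continue             # other flags such as -u
                n++                                 # count a positional argument
            }
            if (n > 2)
                printf "line %d: service %s/%s has %d positional args\n", NR, $1, $3, n
        }
    ' "${1:-/var/etc/inetd.conf}"
}

# Usage:
#   check_nc_entries /var/etc/inetd.conf
```

No output means every nc entry has at most a destination and a port.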

I'm running 2.0-RC1-IPv6 (i386)
built on Sun Mar 20 02:20:38 EDT 2011

« Last Edit: March 24, 2011, 04:57:26 pm by wallabybob »

Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #8 on: March 24, 2011, 05:48:39 pm »
Thanks for investigating; I'll go ahead with the upgrade tonight and see if that changes anything. I haven't updated it yet, so hopefully it'll go smoothly.

Offline wallabybob

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #9 on: March 24, 2011, 06:08:09 pm »
Quote
Thanks for investigating; I'll go ahead with the upgrade tonight
It's probably a good thing to upgrade snapshot builds from time to time, especially when you come across problems. I just want to be clear that I wasn't suggesting you upgrade. After the investigation I just reported, I fully expect the version I'm running would show similar symptoms to your system if I had a similar inetd.conf and traffic activating the nc entries in it.

Do you have any idea what parts of your configuration are responsible for those nc entries in inetd.conf?


Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #10 on: March 24, 2011, 06:11:48 pm »
Right, I figured I'd upgrade anyway. I have no huge hopes that it'll solve this issue, but perhaps something with nc changed?

I haven't done anything exotic, just set up some NAT port forwarding via Web Configurator, which I assume was what added those nc lines to inetd.conf.

Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #11 on: March 24, 2011, 07:40:06 pm »
Upgraded to
2.0-RC1 (i386)
built on Thu Mar 24 13:58:11 EDT 2011

and disabled Snort for the time being. Thanks for the input so far; if it happens again I'll be sure to copy down the logs for more info before rebooting it.

Offline wallabybob

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #12 on: March 24, 2011, 08:06:54 pm »
Quote
I haven't done anything exotic, just set up some NAT port forwarding via Web Configurator, which I assume was what added those nc lines to inetd.conf.
I have a number of port forward rules defined under Firewall -> NAT on the Port Forward tab, and I don't have any nc entries in /var/etc/inetd.conf. Do you have a different type of port forward?

Offline ermal

  • Administrator
  • Hero Member
  • *****
  • Posts: 3365
  • Karma: +3/-0
    • View Profile
Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #13 on: March 25, 2011, 04:01:35 am »
Can you tell me if you have any aliases referenced on port forward rules?

Offline Coinbird

Re: 2.0 RC1 CPU at 100% after 1-4 days
« Reply #14 on: March 25, 2011, 12:17:02 pm »
Yes, I have aliases defined for most of my firewall rules. For some aliases, I specified both the internal and external IPs.

Under Firewall->Aliases, I have a few entries similar to
Name | Values
mailserver | 10.0.0.4, xxx.xxx.xxx.85


Then in Firewall->NAT I created rule(s) using the aliases, like:
WAN    TCP    *    *    xxx.xxx.xxx.85    25 (SMTP)    mailserver 25 (SMTP)    Mail Server
« Last Edit: March 25, 2011, 12:24:41 pm by Coinbird »