Author Topic: Fluctuating CPU USAGE  (Read 138 times)

Offline itdept.tamtech

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-1
Fluctuating CPU USAGE
« on: November 08, 2017, 05:17:51 pm »
Guys, we would like to ask whether the current behavior of our pfSense box is normal. We have noticed it since we upgraded to 2.3.4 and added 2 Intel PCIe NIC cards. CPU usage sits at 0%-1%, then jumps to 50% out of nowhere, and we momentarily lose the internet connection on our network. The load average stays normal, though; we are not seeing any high load.

Our current hardware specs are:
desktop i3
desktop mobo (H81m ASUS)
8gb RAM
500gb hdd

Current running services:
DHCP server; all 4 LAN ports are being used for multiple ISPs

Offline PiBa

  • Hero Member
  • *****
  • Posts: 731
  • Karma: +123/-1
  • PiBa-NL(on IRC)
Re: Fluctuating CPU USAGE
« Reply #1 on: November 08, 2017, 05:35:37 pm »
Check the logs and their timestamps.

When the connection is lost, pfSense will reconfigure some things, adjust firewall rules and restart services. This could cause higher CPU usage after the connection is already lost, but it should settle after a little while. When the connection comes back, this will happen again.

Anyhow, I would probably check whether any alarms are raised or cleared around the start of such issues in the Status / System Logs / System / Gateways log. And maybe raise the acceptable limits for the monitoring.

Offline itdept.tamtech

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-1
Re: Fluctuating CPU USAGE
« Reply #2 on: November 08, 2017, 05:41:26 pm »
Hi PiBa, thanks for the quick reply. What exactly in the logs would I need to check?

Offline itdept.tamtech

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-1
Re: Fluctuating CPU USAGE
« Reply #3 on: November 08, 2017, 05:43:10 pm »
Here is a sample of the log output under Gateways:

Nov 9 07:26:30   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 21217us stddev 21236us loss 0%
Nov 9 07:25:29   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1076442us stddev 2598013us loss 0%
Nov 9 07:13:06   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 59222us stddev 110961us loss 0%
Nov 9 07:12:04   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1748077us stddev 3728413us loss 1%
Nov 9 06:51:12   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 22730us stddev 29675us loss 2%
Nov 9 06:50:12   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 680648us stddev 2801778us loss 17%
Nov 9 06:47:01   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 48774us stddev 135173us loss 1%
Nov 9 06:46:12   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1800458us stddev 4419049us loss 14%
Nov 9 06:46:03   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1224100us stddev 3900733us loss 18%
Nov 9 06:46:01   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 48888us stddev 55786us loss 21%
Nov 9 06:27:12   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 50635us stddev 84350us loss 0%
Nov 9 06:26:11   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1303325us stddev 3015131us loss 2%
Nov 9 06:23:03   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 29939us stddev 62610us loss 0%
Nov 9 06:22:02   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1060297us stddev 3166268us loss 11%
Nov 9 06:04:22   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 27453us stddev 26906us loss 0%
Nov 9 06:03:21   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1426396us stddev 3216820us loss 1%
Nov 9 05:53:33   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 423567us stddev 286985us loss 0%
Nov 9 05:53:06   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 504659us stddev 161576us loss 0%
Nov 9 05:44:46   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 378003us stddev 283877us loss 0%
Nov 9 05:42:58   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 504040us stddev 302687us loss 0%
Nov 9 05:39:24   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 259328us stddev 71765us loss 0%
Nov 9 05:38:23   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1273283us stddev 3211056us loss 6%
Nov 9 05:32:13   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 341071us stddev 441666us loss 0%
Nov 9 05:30:44   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 507608us stddev 364839us loss 0%
Nov 9 05:25:09   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 452379us stddev 395378us loss 0%
Nov 9 05:24:42   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 511638us stddev 227277us loss 0%
Nov 9 05:18:24   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 443379us stddev 395870us loss 0%
Nov 9 05:17:41   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 506608us stddev 485837us loss 0%
Nov 9 04:00:04   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 11498us stddev 10849us loss 0%
Nov 9 03:59:02   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1149667us stddev 2749643us loss 0%
Nov 9 03:42:29   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 13262us stddev 15224us loss 0%
Nov 9 03:41:28   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1113153us stddev 2676270us loss 0%
Nov 9 01:16:24   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 12611us stddev 7688us loss 0%
Nov 9 01:15:22   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1232198us stddev 2804925us loss 4%
Nov 9 01:05:02   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 14461us stddev 22155us loss 0%
Nov 9 01:04:01   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1513372us stddev 3374233us loss 0%
Nov 8 23:08:36   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 14295us stddev 7484us loss 0%
Nov 8 23:07:35   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1203103us stddev 2822637us loss 0%
Nov 8 22:36:01   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 192.168.8.1 bind_addr 192.168.8.100 identifier "OPT1_DHCP "
Nov 8 22:36:01   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 192.168.10.1 bind_addr 192.168.10.5 identifier "WAN_DHCP "
Nov 8 22:36:01   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 10.47.16.1 bind_addr 10.47.16.127 identifier "OPT2_DHCP "
Nov 8 22:35:56   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 192.168.8.1 bind_addr 192.168.8.100 identifier "OPT1_DHCP "
Nov 8 22:35:56   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 192.168.10.1 bind_addr 192.168.10.5 identifier "WAN_DHCP "
Nov 8 22:35:56   dpinger      send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 10.47.16.1 bind_addr 10.47.16.127 identifier "OPT2_DHCP "
Nov 8 22:35:33   dpinger      OPT1_DHCP 192.168.8.1: Alarm latency 420us stddev 58us loss 22%
Nov 8 21:29:48   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 14528us stddev 5071us loss 0%
Nov 8 21:28:47   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1308261us stddev 3007111us loss 0%
Nov 8 21:23:32   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 18009us stddev 33507us loss 0%
Nov 8 21:22:29   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1191405us stddev 2817845us loss 0%
Nov 8 21:17:14   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 14201us stddev 4630us loss 0%

Offline PiBa

  • Hero Member
  • *****
  • Posts: 731
  • Karma: +123/-1
  • PiBa-NL(on IRC)
Re: Fluctuating CPU USAGE
« Reply #4 on: November 08, 2017, 06:02:33 pm »
So ping times to the monitored address go above 1 second from time to time. That's pretty bad, and above the limits, so it would explain pfSense restarting services and using CPU.
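
To make "above the limits" concrete, here is a rough Python sketch that checks one dpinger status line against the thresholds shown in the dpinger startup lines of the log (latency_alarm 500ms, loss_alarm 20%). The regular expression, the function name `explain` and the strict "greater than" comparison are my own assumptions for illustration, not taken from the dpinger source.

```python
import re

# Thresholds copied from the dpinger startup lines in the posted log.
LATENCY_ALARM_US = 500 * 1000   # latency_alarm 500ms, in microseconds
LOSS_ALARM_PCT = 20             # loss_alarm 20%

# Matches the status part of a line such as
# "OPT2_DHCP 10.47.16.1: Alarm latency 1076442us stddev 2598013us loss 0%"
LINE_RE = re.compile(
    r"(?P<gateway>\S+) (?P<addr>\S+): (?P<state>Alarm|Clear) "
    r"latency (?P<latency>\d+)us stddev (?P<stddev>\d+)us loss (?P<loss>\d+)%"
)

def explain(line):
    """Return (gateway, state, reasons) for one dpinger status line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    reasons = []
    # Assumption: a threshold counts as crossed when the value is strictly above it.
    if int(m["latency"]) > LATENCY_ALARM_US:
        reasons.append("latency %.0f ms" % (int(m["latency"]) / 1000))
    if int(m["loss"]) > LOSS_ALARM_PCT:
        reasons.append("loss %s%%" % m["loss"])
    return m["gateway"], m["state"], reasons

samples = [
    "Nov 9 07:25:29   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1076442us stddev 2598013us loss 0%",
    "Nov 9 06:46:01   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 48888us stddev 55786us loss 21%",
    "Nov 9 07:26:30   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 21217us stddev 21236us loss 0%",
]
for sample in samples:
    print(explain(sample))
```

Run over the whole log, this shows that most OPT2_DHCP alarms come from latency rather than loss, which is why raising the latency limit is one of the options.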

So then the question: are you downloading large files, Windows updates, torrents or something similar at that time? Or does it happen even when little traffic is moving, in which case perhaps the ISP itself is having trouble, or there is bufferbloat?

If you're using all available bandwidth at that time and that results in this effect, it might be a good idea to configure traffic shaping or limiters to keep speeds a few percent below the maximum allowed by the ISP, so that the ISP does not need to throttle your packets. That can be complicated to configure correctly, though, and there are quite a few options to choose from.
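
As a small worked example of "a few percent below the maximum", this Python sketch computes a limiter rate from a measured line rate. The 5% headroom and the 20000/2000 kbit/s figures are made-up placeholders, not values from this thread; measure your own line first.

```python
def limiter_kbps(measured_kbps, headroom_pct=5):
    """Rate for a shaper/limiter that stays headroom_pct percent below the line rate."""
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be between 0 and 99")
    # Integer math keeps the result a whole number of kbit/s.
    return measured_kbps * (100 - headroom_pct) // 100

# Hypothetical cable line: 20 Mbit/s down, 2 Mbit/s up.
download_limit = limiter_kbps(20000)
upload_limit = limiter_kbps(2000)
print(download_limit, upload_limit)
```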

Offline itdept.tamtech

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-1
Re: Fluctuating CPU USAGE
« Reply #5 on: November 08, 2017, 06:37:20 pm »
Hi PiBa,

Our current NIC setup is this way:

WAN-LTE
OPT1-LTE
OPT2-CABLE BROADBAND (local provider from the area, deployed a coaxial cable modem)
OPT3-LTE

Based on the log results: OPT2_DHCP is a cable modem provided by the local provider in our area. They deployed coaxial cable instead of Cat5e copper, and we have been having problems since day 1 of usage: random loss of the internet connection at no definite time, and sudden bursts of high latency from 0 ms to 3500 ms, followed by a 100% offline status in pfSense. We have reported this to the provider, but all they do is tell us to restart the modem and say that the signal is fine, with no problem at all on RX/TX.

Thanks

Offline PiBa

  • Hero Member
  • *****
  • Posts: 731
  • Karma: +123/-1
  • PiBa-NL(on IRC)
Re: Fluctuating CPU USAGE
« Reply #6 on: November 08, 2017, 07:00:50 pm »
Maybe check what the Status / Monitoring 'Quality' graph tells you about this gateway (use the wrench at the top right to select quality for the OPT2 gateway). Maybe there is some pattern? Also correlate it with traffic usage; maybe Windows is pulling in an update at that moment?

Also try running tracert or mtr to see whether the problem indeed lies between the modem and the ISP. Try replacing cables, or plug in a regular PC instead of pfSense and check whether it has the same problems, before rebooting the modem.

Is the modem an actual modem? Or is it a combination box that's also handing out IP addresses, or is it perhaps bridged?

Offline itdept.tamtech

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-1
Re: Fluctuating CPU USAGE
« Reply #7 on: November 08, 2017, 07:05:42 pm »
Hi PiBA,

On the advice of the provider's consultant (who says he is CCNP certified), we will change the cable modem, which is a router/WiFi/DHCP combination (pardon me for not being clear), from router mode to bridge mode. We will run mtr and tracert and post the results here.

Thanks


Offline BlueKobold

  • Hero Member
  • *****
  • Posts: 2449
  • Karma: +193/-104
  • pfSense rocks!
Re: Fluctuating CPU USAGE
« Reply #8 on: November 09, 2017, 12:35:59 am »
Quote
to 2.3.4 and added 2 multiple intel pcie nic cards.
Depending on the NICs and drivers in use, and on the number of CPU cores, pfSense creates queues!
If there are many NIC cards, or cards with many ports, that can result in too many queues on the one
hand; on the other hand, the queues can run full or take too long to fill. So you see, this can
go in both directions.

How many cards are actually inside your pfSense box? And how many ports do they offer?

Quote
0%-1% cpu usage then it will fluctuate to 50% out of nowhere then we will momentarily loss an internet connection to our network, but the load average is normal, not seeing any high load.
Please see above for this behavior and where it can hit your network. Nothing is under really
heavy use, but the queues need to be filled, and that takes too long.

Quote
Our current hardware specs is:
desktop i3
desktop mobo (H81m ASUS)
8gb RAM
500gb hdd
Let us imagine that you have a 4C/8T CPU and 4 ports, and the driver for them opens, in the
worst case, 32 queues that must be filled to get a nice packet flow! So you see, limiting the
queues to 1, 2 or 4 per NIC or port should give a better reaction time with no packet loss.
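
The arithmetic behind the 32 queues can be sketched in a few lines of Python. This assumes the worst case described above, where the driver opens one queue per logical CPU on every port; whether a given NIC driver really does that is not confirmed in this thread, and the function name is only illustrative.

```python
def total_queues(logical_cpus, ports, queues_per_port=None):
    """Total NIC queues, assuming one queue per logical CPU per port unless capped."""
    if queues_per_port is None:
        per_port = logical_cpus                      # assumed worst case
    else:
        per_port = min(queues_per_port, logical_cpus)  # explicit per-port limit
    return per_port * ports

print(total_queues(8, 4))                     # 4C/8T CPU, 4 ports, no limit
for cap in (1, 2, 4):
    print(cap, total_queues(8, 4, queues_per_port=cap))
```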

The second thing is how large those queues should be; this can really differ from system
to system and from use case to use case! So it can be good to adjust the mbuf size
to 125000, 250000, 500000, 750000 or 1000000. The two settings act together, so take
the time to fine-tune both points! It is not just one screw to turn; it is more often
an interplay of all the parts involved.
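
Before picking one of those values, it may help to see how much RAM each could tie up. The Python sketch below assumes that "mbuf size" here refers to the kern.ipc.nmbclusters tunable and that each cluster is 2048 bytes, so the result is an upper bound on cluster memory, not what the box will actually use. The 8 GiB figure comes from the hardware list in the first post.

```python
MCLBYTES = 2048  # bytes per standard mbuf cluster (assumed FreeBSD default)

def cluster_ceiling_mib(nmbclusters):
    """Upper bound on mbuf cluster memory, in MiB, for a given cluster limit."""
    return nmbclusters * MCLBYTES / 2**20

def share_of_ram(nmbclusters, ram_gib=8):
    """Fraction of installed RAM that the cluster limit could claim at most."""
    return nmbclusters * MCLBYTES / (ram_gib * 2**30)

for value in (125000, 250000, 500000, 750000, 1000000):
    print(value, round(cluster_ceiling_mib(value)), "MiB",
          "%.1f%% of RAM" % (100 * share_of_ram(value)))
```

Under those assumptions even the largest value stays under a quarter of the 8 GB in this box, so memory should not be what limits the choice.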


Quote
..........then it will fluctuate to 50% out of nowhere then we will momentarily.......
It could well be that you should also check some other things, please.
Directly in the BIOS:
- enable Intel SpeedStep
- enable Intel Turbo Boost 1/2.0
- enable Intel Hyper-Threading (HT)

- enable PowerD (Hiadaptive)
That lets the CPU scale up when needed and back down when it has nothing to do. Otherwise you
could be nailed to a single MHz/GHz value.

Quote
Current running services:
dhcp server, all 4 lan ports are being used for multiple ISP
Normally this is peanuts, nothing significant for such a system!

Quote
WAN-LTE
OPT1-LTE
OPT2-CABLE BROADBAND(local provider from the are, deployed a coaxial cable modem)
OPT3-LTE
If you now find out that not all of the devices in front of your pfSense box at the WAN
ports are pure modems, could it be that they are all routers, including the LTE ones? If so, please have
a look at the WAN interfaces of pfSense and set up static IP addresses there from the network
in front of them. As an example:
- If the cable modem has 192.168.2.1/24, take an address outside the DHCP range of that
device, say 192.168.2.250/24. That way the IP address at the WAN port cannot suddenly be
changed by the DHCP server in front of it, which would cause all connections to be lost!
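
A quick way to sanity-check such a static address is Python's `ipaddress` module. In this sketch the DHCP pool 192.168.2.100 to 192.168.2.200 is a made-up placeholder; look up the real pool in the upstream device before choosing an address. The function name and parameters are only illustrative.

```python
import ipaddress

def is_safe_static(candidate, network, pool_start, pool_end, taken=()):
    """True if candidate is a usable host address in network, outside the DHCP pool."""
    net = ipaddress.ip_network(network)
    ip = ipaddress.ip_address(candidate)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    # Must be inside the subnet, but not the network or broadcast address.
    usable = ip in net and ip not in (net.network_address, net.broadcast_address)
    in_pool = start <= ip <= end
    # Addresses already in use, such as the upstream router itself.
    in_use = ip in {ipaddress.ip_address(t) for t in taken}
    return usable and not in_pool and not in_use

# Example from the post: upstream device at 192.168.2.1/24, hypothetical pool .100-.200.
args = ("192.168.2.0/24", "192.168.2.100", "192.168.2.200")
print(is_safe_static("192.168.2.250", *args, taken=["192.168.2.1"]))
print(is_safe_static("192.168.2.150", *args, taken=["192.168.2.1"]))
```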

Quote
the cable modem is a router/wifi/dhcp(pardon me for net being clear) to change it from router mode to bridge mode. Will do some mtr and tracert and post the results here.
Setting it to bridge mode would be the best thing, but then you will often no longer be able to use
the VoIP and WiFi from that device. Just a hint!

Greetings from Germany
Frank

Offline Stewart

  • Full Member
  • ***
  • Posts: 252
  • Karma: +16/-2
Re: Fluctuating CPU USAGE
« Reply #9 on: November 09, 2017, 12:55:02 pm »
Looks to me like the router is detecting an outage and attempting to restart services to bring it back up.  Since dpinger is showing the latency, I wouldn't suspect one of the other services unless you are pegged at 100% utilization on the box (which you aren't).  +1 for PiBa's suggestion to check the quality graphs.  I suspect you'll see latency and packet loss in there.  See if they correlate with anything in the traffic graphs.  In certain instances maxing out your upload can cause it as well, if the gateway is too busy processing packets to respond to an ARP request from the router.  I would suspect the CPU spikes are an effect of the outage and not the cause.  The graphs and logs will point to which is which.
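
One rough way to check cause versus effect from the logs is to turn the dpinger Alarm/Clear pairs into time windows and see whether a CPU spike falls inside one. The Python sketch below does that for four lines from the posted log. The spike times are invented for illustration, the year is assumed because the log lines carry none, and the `%b` month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Timestamp, gateway name and state from a dpinger status line.
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d)\s+dpinger\s+"
    r"(?P<gateway>\S+) \S+: (?P<state>Alarm|Clear) "
)

def parse_events(lines, year=2017):
    """Return (timestamp, gateway, state) tuples sorted oldest first."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            ts = datetime.strptime("%d %s" % (year, m["ts"]), "%Y %b %d %H:%M:%S")
            events.append((ts, m["gateway"], m["state"]))
    return sorted(events)

def alarm_windows(events, gateway):
    """Pair each first Alarm with the next Clear for one gateway."""
    windows, start = [], None
    for ts, gw, state in events:
        if gw != gateway:
            continue
        if state == "Alarm" and start is None:
            start = ts
        elif state == "Clear" and start is not None:
            windows.append((start, ts))
            start = None
    return windows

def during_alarm(moment, windows):
    """True if moment lies inside any alarm window."""
    return any(start <= moment <= end for start, end in windows)

log = [
    "Nov 9 07:26:30   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 21217us stddev 21236us loss 0%",
    "Nov 9 07:25:29   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1076442us stddev 2598013us loss 0%",
    "Nov 9 07:13:06   dpinger      OPT2_DHCP 10.47.16.1: Clear latency 59222us stddev 110961us loss 0%",
    "Nov 9 07:12:04   dpinger      OPT2_DHCP 10.47.16.1: Alarm latency 1748077us stddev 3728413us loss 1%",
]
windows = alarm_windows(parse_events(log), "OPT2_DHCP")
for start, end in windows:
    print(start, "to", end, (end - start).total_seconds(), "s")

# Hypothetical CPU spike times read off the dashboard or graphs.
print(during_alarm(datetime(2017, 11, 9, 7, 12, 30), windows))
print(during_alarm(datetime(2017, 11, 9, 7, 20, 0), windows))
```

If the spikes consistently land inside or just after these windows, that supports the view that the CPU usage is a reaction to the gateway alarm and not its cause.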