Topic: [v2.3 & v2.4] Kernel crash with Fatal trap 12: page fault while in kernel mode

CDuv
My 2.3.2-RELEASE-p1 appliance (Lanner FW-7551: Atom C2758 with 8GB of RAM) is crashing at least once per hour.

It reboots itself and resumes network access automatically.

The crash report is as follows:
Crash report begins.  Anonymous machine information:

amd64
10.3-RELEASE-p9
FreeBSD 10.3-RELEASE-p9 #1 5fc1b19(RELENG_2_3_2): Tue Sep 27 12:26:06 CDT 2016     root@ce23-amd64-builder:/builder/pfsense-232/tmp/obj/builder/pfsense-232/tmp/FreeBSD-src/sys/pfSense

Crash report details:

Filename: /var/crash/bounds
1

Filename: /var/crash/info.0
Dump header from device /dev/label/swap0
  Architecture: amd64
  Architecture Version: 1
  Dump Length: 80896B (0 MB)
  Blocksize: 512
  Dumptime: Wed Nov  2 13:21:44 2016
  Hostname: hermes.example.com
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 10.3-RELEASE-p9 #1 5fc1b19(RELENG_2_3_2): Tue Sep 27 12:26:06 CDT 2016
    root@ce23-amd64-builder:/builder/pfsense-232/tmp/obj/builder/pfsense-232/tmp/FreeBSD-src/sys/pfSense
  Panic String: sbflush_internal: cc 4294965256 || mb 0 || mbcnt 0
  Dump Parity: 916287357
  Bounds: 0
  Dump Status: good


Full crash report.

Googling led me to Issue #4689 and a freebsd-current mailing list thread, but these are old, resolved issues.
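
A side note on the panic string in the report: 4294965256 sits just below 2^32, which is what a small negative count looks like when it is printed as an unsigned 32-bit number. A minimal Python sketch of that arithmetic, assuming (this is not confirmed anywhere in this thread) that the kernel stores `cc` in an unsigned 32-bit field; the helper name `as_signed32` is made up for illustration:

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


cc = 4294965256  # the "cc" value from the panic string
print(as_signed32(cc))  # -2040
```

If that reading is right, the socket buffer byte count dropped 2040 bytes below zero, which would hint at an accounting problem rather than a genuinely huge buffer.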

I have:
  • 2.3.2-RELEASE-p1 (amd64) - built on Tue Sep 27 12:13:07 CDT 2016 - FreeBSD 10.3-RELEASE-p9
  • 3 LANs (where 2 are VLAN interfaces)
  • 3 WANs with load balancing
  • A few NAT rules
  • Some firewall rules
  • 6 services
    • dhcpd (enabled on 2 interfaces)
    • dpinger
    • ntpd
    • openvpn
    • sshd
    • unbound


Edit: Added hardware brand and model
« Last Edit: November 15, 2016, 03:22:37 am by CDuv »

CDuv
I am starting to think this is because of the OpenVPN service which uses the following settings:
  • DH Parameter length (bits): 2048
  • Encryption Algorithm: AES-256-CBC (256-bit)
  • Auth digest algorithm: SHA256 (256-bit)
  • Hardware Crypto: No Hardware Crypto Acceleration

I tried enabling "AES-NI CPU-based Acceleration" as the system's Cryptographic Hardware and "BSD cryptodev engine - RSA, DSA, DH, AES-128-CBC, AES-192-CBC, AES-256-CBC" as Hardware Crypto for OpenVPN (as advised on IRC), but it does not prevent the crashes.

The box does not seem to overheat: it sits around 34 °C (I cannot seem to graph this, though).

CDuv
I spent the whole day with the OpenVPN server disabled, and yet it still crashed, but only once (not 4 times a day as before)...

CDuv
Following the "Tuning and Troubleshooting Network Cards" wiki entry, I added:
kern.ipc.nmbclusters=1000000
to my /boot/loader.conf.local file.

It didn't change anything: still crashing.

CDuv
« Reply #4 on: November 11, 2016, 11:28:54 am »
I did a full re-installation of v2.3.2 from a USB memstick (Serial), then applied the "-p1" patch.
I re-added
kern.ipc.nmbclusters=1000000
to my /boot/loader.conf.local file.

But it is still crashing (twice last night and once around noon, even though no one was using it today).
I have a clone of this server (exact same model), and it has the same problem (so the issue is not related to faulty hardware).


I have lots of crash reports, but I don't know how to read them: can someone help me?
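
As a starting point for reading them: the `/var/crash/info.N` headers quoted in this thread are plain "Key: value" lines, so the panic string can be pulled out with a short script. A minimal Python sketch, assuming that layout holds for every report; `parse_info` is a hypothetical helper, and the sample text is trimmed from the report in the first post:

```python
def parse_info(text: str) -> dict:
    """Collect the "Key: value" lines of a crash dump header into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip lines that are not "Key: value"
            fields[key] = value
    return fields


SAMPLE = """\
Dump header from device /dev/label/swap0
  Architecture: amd64
  Dumptime: Wed Nov  2 13:21:44 2016
  Magic: FreeBSD Text Dump
  Panic String: sbflush_internal: cc 4294965256 || mb 0 || mbcnt 0
  Dump Status: good
"""

info = parse_info(SAMPLE)
print(info["Dumptime"], "->", info["Panic String"])
```

To run it against a real report, read the file instead of the sample, for example `parse_info(open("/var/crash/info.0").read())`.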

stedwsyy
« Reply #5 on: November 12, 2016, 01:59:36 am »
This series of events is unlikely to happen.

CDuv
« Reply #6 on: November 12, 2016, 10:57:04 am »
That's why I am completely lost here...


Here is a map of the network installation:

    /------►(   Internet   )◄-------------\
    |               ▲                     |
    |               |                     |
┌─────┐        ┌─────┐                 ┌─────┐
│ISP B --\     │ISP C │                 │ISP A │
│router│  |     │router│                 │router│
└──────┘  |     └─────┘                 └─────┘
          |        |                          |
       (WAN_B)  (WAN_C)                       |
          |        |                          |
┌─────────────────────────────────┐      (WAN_A)
│        igb2     igb3              │         |
│                                   │         |
│  Lanner FW-7551 running pfSense   │         |
│                                   │         |
│                 VID3&VID4         │         |
│igb0    igb1        igb4      igb5 │         |
└───────────────────────────────┘         |
  |       |           |         |             |
(LAN)  (WAN_A)  (LAN_GUEST1)  (SYNC)          |
  |       |           &         |             |
  |       |     (LAN_GUEST2)    |             |
  |       |           |         |             |
  |       |           |         x             |
  |       |           |  unused yet: left     |
  |       |           |  for future HA        |
  |       |           |  pfSense setup        |
  |       |           |                       |
┌───────────────────────────────────────┐  |
│ p7     p13         p12                   │  |
│VID1    VID2     VID3&VID4                │  |
│                                          │  |
│         D-Link DGS managed switch      p9--/
│                                      VID2│
│ VID1                     VID3  VID4      │
│  p8                      p10   p11       │
└───────────────────────────────────────┘
   |                        |     |
   |                        |     \---► (LAN_GUEST2)
   ▼                        |
 (LAN)                      \---► (LAN_GUEST1)


Particularities:
  • I am using a VLAN for WAN_A (VID2) because ISP A's router has only one Ethernet port for my pfSense router, and I want to install a backup pfSense box. I therefore use a managed switch and VLANs to virtually multiply the Ethernet ports so that I can plug in the backup server (this setup worked fine for years on the previous pfSense server).
  • ISP A's router requires the interface to be forced to "100 Mbps Full duplex", so igb1 and switch ports p13 and p9 are set to 100 Mbps Full duplex.
  • The igb4 interface is the "host" of two virtual interfaces that use VLANs: VID3 and VID4 for LAN_GUEST1 and LAN_GUEST2
  • This is a multi-WAN with load balancing scenario
  • But WAN_B and WAN_C are currently unused/disconnected (interfaces are forced to "down")

Networks:
  • LAN: Network used by the clients (10.0.0.0/8)
    • pfSense has 10.0.0.5 (igb0 interface)
    • pfSense also uses a Virtual IP alias (igb0 interface) 10.0.0.254 that the LAN clients use as their gateway
  • WAN_A: Network between pfSense and the router from my ISP "A" (80.26.35.12/30)
    • ISP A's router has 80.26.35.13
    • pfSense has 80.26.35.14 (igb1 interface), using 80.26.35.13 as the gateway
  • WAN_B: Network between pfSense and the router from ISP "B" (192.168.2.0/24)
  • WAN_C: Network between pfSense and the router from ISP "C" (192.168.3.0/24)
  • LAN_GUEST1: Network used by guest clients (192.168.10.0/24)
    • pfSense has 192.168.10.5 (igb4 interface)
    • pfSense also uses a Virtual IP alias (igb4 interface) 192.168.10.1 that the LAN_GUEST1 clients use as their gateway
  • LAN_GUEST2: Network used by other guest clients (192.168.20.0/24)
    • pfSense has 192.168.20.5 (igb4 interface)
    • pfSense also uses a Virtual IP alias (igb4 interface) 192.168.20.1 that the LAN_GUEST2 clients use as their gateway
  • SYNC: Future HA synchronization network
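
The addressing in this list can be sanity-checked with Python's standard `ipaddress` module. This is only a sketch: the dictionary is retyped by hand from the list, and its layout is an assumption for illustration, not something exported from pfSense.

```python
import ipaddress

# network -> addresses that the list places inside it
PLAN = {
    "10.0.0.0/8": ["10.0.0.5", "10.0.0.254"],
    "80.26.35.12/30": ["80.26.35.13", "80.26.35.14"],
    "192.168.10.0/24": ["192.168.10.5", "192.168.10.1"],
    "192.168.20.0/24": ["192.168.20.5", "192.168.20.1"],
}


def misplaced(plan):
    """Return (address, network) pairs where the address is outside the network."""
    bad = []
    for net, addrs in plan.items():
        network = ipaddress.ip_network(net)  # raises ValueError if host bits are set
        for addr in addrs:
            if ipaddress.ip_address(addr) not in network:
                bad.append((addr, net))
    return bad


print(misplaced(PLAN))  # an empty list means every address fits its network
```

The /30 used for WAN_A has exactly two usable host addresses, .13 and .14, so the router and pfSense fill it completely.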
 

VLANs (VID<->Network mapping):
  • VID1: LAN
  • VID2: WAN_A
  • VID3: LAN_GUEST1
  • VID4: LAN_GUEST2


Today (a non-working day) it crashed about every two or three hours.
« Last Edit: November 12, 2016, 11:00:16 am by CDuv »

beppo
« Reply #7 on: November 12, 2016, 12:43:02 pm »
I have the same problem on a Supermicro board. The OpenVPN service is not enabled, but I get daily reboots after crashes. I think the issue is somehow related to the 2.3.x version, as earlier versions ran for months without problems.

kpa
« Reply #8 on: November 12, 2016, 12:53:15 pm »
If the crash is always the same, or looks very similar to the others, the problem is likely a software one. Random crashes with wildly varying types of reports indicate a hardware problem instead.

CDuv
« Reply #9 on: November 12, 2016, 02:37:22 pm »
I don't know how to read crash reports.
Sometimes the file "/var/crash/info.0" has:
Quote
Panic String: sbflush_internal: cc 0 || mb 0xfffff800643e2800 || mbcnt 2304
and sometimes it does not.
But the crash report always has:
Quote
Fatal trap 12: page fault while in kernel mode

The igb driver is Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
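
To apply kpa's "same crash or different crashes" test across many reports, it helps to strip the values that change on every crash (pointers, counters) before comparing panic strings. A rough Python sketch; the normalisation rule and the `signature` helper are assumptions for illustration, and the two sample strings are the ones quoted so far in this thread:

```python
import re
from collections import Counter


def signature(panic: str) -> str:
    """Replace hex pointers and decimal counters with N so similar panics compare equal."""
    return re.sub(r"\b(0x[0-9a-fA-F]+|\d+)\b", "N", panic)


panics = [
    "sbflush_internal: cc 4294965256 || mb 0 || mbcnt 0",
    "sbflush_internal: cc 0 || mb 0xfffff800643e2800 || mbcnt 2304",
]

# Count how many reports share each normalised signature
for sig, count in Counter(signature(p) for p in panics).most_common():
    print(count, sig)
```

Both strings collapse to the same signature here. A single signature covering most reports would fit the software explanation; many unrelated signatures would fit the hardware one.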

w0w
« Reply #10 on: November 13, 2016, 01:45:41 am »
Try a different pfSense version, for example 2.2.6 or even 2.4-BETA. If the problem persists, it looks more like a hardware issue. The fastest way to check is to try the same config and version on other hardware, but that is only possible if you have another box.

CDuv
« Reply #11 on: November 13, 2016, 08:11:39 am »
OK, I'll try other versions: I have other hardware.

I guess testing v2.3.3 wouldn't do any better (since I doubt this unknown bug would get fixed).

I would rather try v2.4 than v2.2 (to avoid losing any feature v2.3 brought): would my v2.3.2-p1 configuration file be accepted on v2.4?

CDuv
« Reply #12 on: November 14, 2016, 04:29:40 am »
Should I disable "Flow Control" (as the Wiki says)?

w0w
« Reply #13 on: November 14, 2016, 01:39:57 pm »
Yes, 2.4 can use a backup config from previous versions. You can also try any other settings you find, not only flow control.

CDuv
« Reply #14 on: November 15, 2016, 03:22:07 am »
So I tried 2.4.0-BETA v20161113-2326 (pfSense-CE-memstick-serial-2.4.0-BETA-amd64-20161113-2326): in 15 hours it crashed twice (after 9h and after 15h).

When I logged in to the webConfigurator to get the first crash report, I got it fine:

Quote
Crash report begins.  Anonymous machine information:

amd64
11.0-RELEASE-p3
FreeBSD 11.0-RELEASE-p3 #180 8fb831d(RELENG_2_4): Sun Nov 13 23:31:20 CST 2016     root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense

Crash report details:

Filename: /var/crash/bounds
1

Filename: /var/crash/info.0
Dump header from device: /dev/ada0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 580517888
  Blocksize: 512
  Dumptime: Tue Nov 15 04:00:16 2016
  Hostname: pfsensebox.example.com
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 11.0-RELEASE-p3 #180 8fb831d(RELENG_2_4): Sun Nov 13 23:31:20 CST 2016
    root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
  Panic String: page fault
  Dump Parity: 1903556642
  Bounds: 0
  Dump Status: good

Filename: /var/crash/info.last
Dump header from device: /dev/ada0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 580517888
  Blocksize: 512
  Dumptime: Tue Nov 15 04:00:16 2016
  Hostname: pfsensebox.example.com
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 11.0-RELEASE-p3 #180 8fb831d(RELENG_2_4): Sun Nov 13 23:31:20 CST 2016
    root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
  Panic String: page fault
  Dump Parity: 1903556642
  Bounds: 0
  Dump Status: good

Filename: /var/crash/minfree
2048
            

But while I was sending the report to the developers, the box crashed a second time. I don't want to send that second report (doing so might crash the system again), but I got this on the serial console:

Quote
Enter an option:
Message from syslogd@pfsensebox at Nov 15 10:01:15 ...
pfsensebox php-fpm[84587]: /index.php: Successful login for user 'admin' from: 10.0.1.53
panic: sbsndptr: sockbuf 0xfffff8010d811518 and mbuf 0xfffff8010ddc6000 clashing
cpuid = 6
Uptime: 6h0m53s
Dumping 567 out of 8135 MB: (CTRL-C to abort) ..3%..12%..23%..32%..43%..51%..63%..71%..82%..91%
Dump complete
                                                                             99
TAB Key on Remote Keyboard To Entry Setup Menu
MB-7551 Ver.AE0 03/28/2014
Version 2.16.1242. Copyright (C) 2013 American Megatrends, Inc.
Press <DEL> or <ESC> to enter setup.


(.. many empty lines ..)


|oading /boot/defaults/loader.conf serial port                                 
/IOS drive C: is disk0        /boot/config: -S115200 -D
BIOS 619kB/2081240kB available memory

FreeBSD/x86 bootstrap loader, Revision 1.1
(root@buildbot2.netgate.com, Wed Aug  3 08:04:25 CDT 2016)


(.. many empty lines ..)

/boot/entropy size=0x100017b93e]a0 |       
Booting... _/ ___|  ___ _ __  ___  ___     
Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  | .__/The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-RELEASE-p3 #180 8fb831d(RELENG_2_4): Sun Nov 13 23:31:20 CST 2016
    root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense amd64


I have a 567MB file "/var/crash/vmcore.0".