Author Topic: HEADS UP: Textdumps coming in next snapshots to aid debugging  (Read 8276 times)

Offline jimp

  • Administrator
  • Hero Member
  • *****
  • Posts: 14931
    • View Profile
HEADS UP: Textdumps coming in next snapshots to aid debugging
« on: January 27, 2011, 11:53:43 am »
Starting with the next snapshots dated after this post, textdump support will be in all kernels for full installs.

Embedded systems lack the swap space necessary for dumps to work, so debugging with them will still require switching to a debug kernel.

So what are textdumps?
In short, they automatically capture information about a crash: the panic message, a backtrace, and other information about the system state at the time of the crash.
For more in-depth detail, see here

After a crash, when your system reboots, you will be left with the textdump data in tar format under /var/crash, like so:
Code:
-rw-------  1 root  wheel  123904 Jan 27 17:01 textdump.tar.0

If there are multiple textdumps, the number at the end will increment. The highest number (and of course the latest date) is always the most recent data.
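
The "highest number is the most recent" rule can be scripted. What follows is only a minimal sketch: the function name latest_textdump is made up for illustration, and it assumes the dumps are named textdump.tar.N with a plain decimal N, as in the listing above.

```shell
# Print the path of the highest-numbered textdump.tar.N in a directory.
# latest_textdump is a hypothetical helper name, not part of pfSense.
# The directory defaults to /var/crash, as described in the post.
latest_textdump() {
    dir="${1:-/var/crash}"
    best=""
    best_n=-1
    for f in "$dir"/textdump.tar.*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        n="${f##*.}"                     # text after the last dot
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip non-numeric suffixes
        esac
        if [ "$n" -gt "$best_n" ]; then
            best_n="$n"
            best="$f"
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    else
        return 1                         # no textdump found
    fi
}
```

Called with no argument it looks in /var/crash. It compares the numeric suffix rather than sorting names, so textdump.tar.10 is treated as newer than textdump.tar.9.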

If asked by a developer, you can submit the whole tar file, or untar the file and copy/paste the individual files into forum attachments. To unpack the tar file on the router, you can do something like this:

Code:
[2.0-BETA5][root@pfsense.localdomain]/root(1): mkdir crash
[2.0-BETA5][root@pfsense.localdomain]/root(2): cd crash/
[2.0-BETA5][root@pfsense.localdomain]/root/crash(3): tar xvf /var/crash/textdump.tar.0
x ddb.txt
x config.txt
x msgbuf.txt
x panic.txt
x version.txt
[2.0-BETA5][root@pfsense.localdomain]/root/crash(4): ls -l
total 122
-rw-------  1 root  wheel   5289 Jan 27 17:00 config.txt
-rw-------  1 root  wheel  49152 Jan 27 17:00 ddb.txt
-rw-------  1 root  wheel  65508 Jan 27 17:00 msgbuf.txt
-rw-------  1 root  wheel     16 Jan 27 17:00 panic.txt
-rw-------  1 root  wheel    138 Jan 27 17:00 version.txt

Or copy the tar file off with scp and unpack it locally.
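
For the scp route, something along these lines should work on the local machine. It is a sketch, not a tested procedure: the host name in the comment is simply the one from the prompt above, and unpack_textdump is a hypothetical helper name.

```shell
# Copy the dump off the router first, for example (host name taken from the
# prompt in the transcript; adjust to your own router):
#   scp root@pfsense.localdomain:/var/crash/textdump.tar.0 .

# Unpack a textdump tar file into its own directory and list the contents.
# Usage: unpack_textdump textdump.tar.0 [destination-directory]
# unpack_textdump is a hypothetical helper name used only for illustration.
unpack_textdump() {
    tarfile="$1"
    dest="${2:-crash}"                   # same directory name as the transcript
    [ -f "$tarfile" ] || { echo "no such file: $tarfile" >&2; return 1; }
    mkdir -p "$dest" || return 1
    tar -xf "$tarfile" -C "$dest" || return 1
    ls -l "$dest"
}
```

Unpacking into a separate directory keeps the five text files from landing in whatever directory you happen to be in.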

The files included in the dump are:

* config.txt - Kernel configuration
* ddb.txt - Captured DDB output (including the backtrace and other helpful info)
* msgbuf.txt - Kernel message buffer, should be roughly equivalent to the system.log file.
* panic.txt - Kernel panic message, if there was a panic
* version.txt - Kernel version string
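
If you only want a quick look at the two smallest files in the list above before posting, they can be read straight out of the tar file. This is a sketch under the assumption that the member names are exactly version.txt and panic.txt, as in the transcript; show_panic_summary is a hypothetical name.

```shell
# Print version.txt and panic.txt from a textdump without unpacking it.
# show_panic_summary is a hypothetical helper name.
# panic.txt is only present if there was a panic, so a missing member
# is reported rather than treated as an error.
show_panic_summary() {
    tarfile="$1"
    for member in version.txt panic.txt; do
        echo "== $member =="
        # -O writes the extracted member to stdout instead of to disk
        tar -xOf "$tarfile" "$member" 2>/dev/null || echo "(not present)"
    done
}
```

For example: show_panic_summary /var/crash/textdump.tar.0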

You may still need to switch to a debug kernel to get more detail about locks.

Hopefully in the near future, none of this will be necessary :-)
Need help fast? Commercial Support!

Co-Author of pfSense: The Definitive Guide. - Check the Doc Wiki for FAQs.

Do not PM for help!

Offline David Szpunar

  • Full Member
  • ***
  • Posts: 165
    • View Profile
    • David Szpunar
« Reply #1 on: January 28, 2011, 10:11:41 am »
Well, that's nice to hear! I had a system I upgraded from a Nov. snapshot to one from yesterday, and it started locking up every 20 minutes with no indication of what the problem was. I had to reinstall the Nov. snapshot and restore the config; now it's fine again. I also had a VMware deployment I did last week where I had to revert to the previous firewall, because as soon as I put it in production and connected it to the cable modem, it started locking up. Again there was no indication of what the problem was, so it's great to hear we'll be able to grab logs when that happens, thanks!
David Szpunar
I use pfSense wherever I can, and I break the rule about not using 2.0 beta in production, because it's so cool :-)

Offline jimp

« Reply #2 on: January 28, 2011, 10:15:18 am »
Well, this won't help with hangs/lockups (unfortunately) but it will grab data from panics, which some people may just see as "spontaneous reboots" because they may not see the panic message on the console due to the automatic reboot.

At some point before release these dumps will probably be scaled back to only being in the dev kernel again as the changes required to make this work increased the kernel sizes significantly.

Offline David Szpunar

« Reply #3 on: January 28, 2011, 10:20:51 am »
Boo...any suggestions on the hangs/lockups, or am I out of luck? I keep meaning to post a question in the forum but a) I have no actual information, the System Logs are clean and b) I haven't had time. I assume I'd need a debug kernel at the least but I haven't messed with those at all, and I don't know if they'd help.

Offline jimp

« Reply #4 on: January 28, 2011, 10:24:32 am »
Join the party on this thread:

PFSense 2.0 Beta5 1/19 build system locks up

It's something in the FTP proxy, it's being actively pursued and should hopefully be fixed soon...

Offline David Szpunar

« Reply #5 on: January 28, 2011, 10:25:35 am »
Thanks, I saw that right after my above post and already have half a post typed up :-)

Offline rcfa

  • Sr. Member
  • ****
  • Posts: 565
    • View Profile
« Reply #6 on: January 28, 2011, 12:50:52 pm »
Great! May I suggest an auto-rotate feature, where the numbered dumps are archival copies with the highest number being the oldest log?

This way dumping a log doesn't require messing with file names, it's always going to be the one without number, while the proper re-numbering happens during the system reboot.

More importantly, this allows setting a limit on how many dumps are archived, so that any dumps exceeding that number are deleted in the same pass. Otherwise there is a chance that a modestly sized system disk fills up with these dumps without the user noticing...

Offline jimp

« Reply #7 on: January 28, 2011, 12:55:09 pm »
Well this feature isn't going to stick with the normal kernels upon release, only the dev kernels.

Adding some kind of rotation might be OK (or just a symlink to the newest one would be best) but for something that is hopefully a very rare occurrence it may not be worth the effort.

The way they are numbered now is handled by the savecore utility, and that isn't something we'd want to start making custom changes to; it's just the way it works on FreeBSD. Anything we come along and change after savecore is done may alter the behavior of savecore as well... Unless we move the crash info to somewhere else (like /root/crash/) and always leave /var/crash empty after savecore is run.

Might look into doing that at some point, but it sounds like a fair amount of effort for not a lot of gain.
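
For anyone who wants to experiment, the "keep a limited number plus a symlink to the newest" idea could look roughly like this. It is a sketch only and not anything pfSense ships: prune_textdumps and the link name textdump.latest are invented for illustration, and it is meant for a separate archive directory such as /root/crash, since touching /var/crash itself may affect savecore as noted above.

```shell
# Keep only the $2 highest-numbered textdump.tar.N files in directory $1
# and point a "textdump.latest" symlink at the newest one.
# prune_textdumps and textdump.latest are hypothetical names.
prune_textdumps() {
    dir="$1"
    keep="$2"
    # Refuse anything that is not a count of at least 1.
    [ "$keep" -ge 1 ] 2>/dev/null || { echo "keep must be >= 1" >&2; return 1; }
    # Numeric suffixes only, highest (newest) first.
    nums=$(ls "$dir" | sed -n 's/^textdump\.tar\.\([0-9][0-9]*\)$/\1/p' | sort -rn)
    count=0
    for n in $nums; do
        count=$((count + 1))
        if [ "$count" -eq 1 ]; then
            # Relative link so the directory can be moved or copied.
            ln -sf "textdump.tar.$n" "$dir/textdump.latest"
        fi
        if [ "$count" -gt "$keep" ]; then
            rm -f "$dir/textdump.tar.$n"
        fi
    done
}
```

For example, prune_textdumps /root/crash 3 would leave the three newest dumps there and remove the rest. Files that do not match textdump.tar.N (such as vmcore.N) are left alone.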

Offline wallabybob

  • Hero Member
  • *****
  • Posts: 5262
    • View Profile
« Reply #8 on: January 28, 2011, 09:02:58 pm »
I installed 2.0-BETA5 (i386) built on Thu Jan 27 20:55:04 EST 2011 (manual upgrade from build on 1 Jan 2011) and it crashed a few hours after installation. It stopped at bt> prompt so I gave the command call doadump to write a crash dump. On reboot I saw savecore report it was writing a crash dump. I now have a vmcore.0 in /var/crash but no textdump.tar.0.

Something gone wrong?

Offline jimp

« Reply #9 on: January 28, 2011, 09:04:54 pm »
Sounds like the textdump code didn't initiate the ddb scripts like it should. No other errors in the boot log (way at the top before the message about writing out the core)?

Though there are newer snapshots out now, they may behave differently.

Offline wallabybob

« Reply #10 on: January 28, 2011, 11:59:24 pm »
Quote
No other errors in the boot log (way at the top before the message about writing out the core)?

Should it be in /var/log/system.log? It wasn't in the dmesg output. /var/log/system.log went back only 8 hours (it was full of the once-a-minute noise from hostapd: group key handshake completed).

Looks like a "significant" sysctl variable got set on reload:
Quote
# sysctl -a | grep panic
kern.sync_on_panic: 0
kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
debug.ddb.textdump.do_panic: 1
debug.trace_on_panic: 0
debug.debugger_on_panic: 1
debug.kdb.panic: 0
machdep.enable_panic_key: 0
machdep.panic_on_nmi: 1
#
Maybe I'll upgrade to a newer snapshot in the next couple of days.

Offline jimp

« Reply #11 on: January 29, 2011, 12:15:33 am »
No, the console errors I'm talking about would be way before the system log starts. They would have been echoed to the console, not syslog.

From those sysctls it looks like it should have been OK though.
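
A rough way to eyeball the same thing from a script is sketched below. It rests on an assumption drawn from this thread rather than from any documented requirement: that the settings look sane when debug.debugger_on_panic is 1 and the kdb.enter.panic script contains "textdump set" followed later by "call doadump". The name check_textdump_settings is hypothetical.

```shell
# Read "sysctl -a | grep panic" style output on stdin and report whether
# it matches the settings quoted in this thread.
# check_textdump_settings is a hypothetical helper name; the two checks
# are assumptions based on the thread, not an official checklist.
check_textdump_settings() {
    input=$(cat)
    status=0
    if printf '%s\n' "$input" | grep -q '^debug\.debugger_on_panic: 1$'; then
        echo "ok: debug.debugger_on_panic is 1"
    else
        echo "warn: debug.debugger_on_panic is not 1"
        status=1
    fi
    # Pull out the script text that follows "kdb.enter.panic=".
    script=$(printf '%s\n' "$input" | sed -n 's/^kdb\.enter\.panic=//p')
    case "$script" in
        *"textdump set"*"call doadump"*)
            echo "ok: kdb.enter.panic requests a textdump before dumping" ;;
        *)
            echo "warn: kdb.enter.panic does not request a textdump"
            status=1 ;;
    esac
    return "$status"
}
```

Usage on the router would be: sysctl -a | grep panic | check_textdump_settings. As this thread shows, passing these checks does not guarantee that a textdump actually gets written.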

Offline wallabybob

« Reply #12 on: April 22, 2011, 07:39:35 am »
I recently upgraded one of my home systems to pfSense 2.0-RC1 (i386) built on Mon Apr 18 23:29:41 EDT 2011. Since enabling Captive Portal I've had a few panics (kernel page faults). On every occasion the system entered the debugger, displaying the dbg> prompt. I typed bt and call doadump, but on reboot I have not seen any textdump.tar files in /var/crash:
Quote
# ls -l /var/crash
total 167018
-rw-r--r--  1 root  wheel         2 Apr 22 20:31 bounds
-rw-------  1 root  wheel       485 Apr 19 23:04 info.0
-rw-------  1 root  wheel       485 Apr 22 20:22 info.1
-rw-------  1 root  wheel       486 Apr 22 20:31 info.2
-rw-r--r--  1 root  wheel         5 Apr 19 13:20 minfree
-rw-------  1 root  wheel  68562944 Apr 19 23:04 vmcore.0
-rw-------  1 root  wheel  51036160 Apr 22 20:22 vmcore.1
-rw-------  1 root  wheel  71602176 Apr 22 20:31 vmcore.2
# more /var/crash/info.2
Dump header from device /dev/ad0s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 71602176B (68 MB)
  Blocksize: 512
  Dumptime: Fri Apr 22 20:23:32 2011
  Hostname: pfsense2.example.org
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 8.1-RELEASE-p2 #1: Mon Apr 18 23:21:20 EDT 2011
    sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense.8
  Panic String:
  Dump Parity: 2021776436
  Bounds: 2
  Dump Status: good
#

The following sysctl variables are set:
Quote
# sysctl -a | grep panic
kern.sync_on_panic: 0
kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
debug.ddb.textdump.do_panic: 1
debug.trace_on_panic: 0
debug.debugger_on_panic: 1
debug.kdb.panic: 0
machdep.enable_panic_key: 0
machdep.panic_on_nmi: 1
#

On reading the textdump man page it looks as if I also need to do:
Quote
# sysctl debug.ddb.textdump.pending=1
debug.ddb.textdump.pending: 0 -> 1

Offline jimp

« Reply #13 on: April 22, 2011, 07:41:08 am »
For some reason certain systems (haven't been able to figure out which) aren't making proper textdumps. If they happen, they should be happening automatically. Try switching to a debug kernel to see if that helps:

http://doc.pfsense.org/index.php/Switching_Kernels

If not, switch back to the SMP kernel afterward.

Offline wallabybob

« Reply #14 on: April 22, 2011, 04:50:11 pm »
The system crashed again with a kernel page fault at the same PC. Despite my having set debug.ddb.textdump.pending=1, it didn't reboot automatically, so I typed in the commands from kdb.enter.panic one per line, and then on reboot savecore reported it was writing textdump.tar.
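
To get that one-per-line list without retyping it from memory, the script text can be split at the semicolons. A small sketch, assuming the script is passed in as a single string exactly as it appears after "kdb.enter.panic=" in the sysctl output; split_ddb_script is a hypothetical name.

```shell
# Split a semicolon-separated ddb script into one command per line,
# trimming surrounding whitespace and dropping empty entries.
# split_ddb_script is a hypothetical helper name.
split_ddb_script() {
    printf '%s\n' "$1" \
        | tr ';' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example with the script quoted earlier in this thread:
# split_ddb_script 'textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset'
```

The output is the list of commands to type at the debugger prompt, in order, starting with textdump set and ending with reset.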

This system has a VIA C3 CPU and is running
Quote
# dmesg
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-RELEASE-p2 #1: Mon Apr 18 23:21:20 EDT 2011
    sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense.8 i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: VIA Samuel 2 (797.74-MHz 686-class CPU)
  Origin = "CentaurHauls"  Id = 0x673  Family = 6  Model = 7  Stepping = 3
  Features=0x803035<FPU,DE,TSC,MSR,MTRR,PGE,MMX>
real memory  = 268435456 (256 MB)

I'll switch to a debug kernel as suggested.