
Topic: igb 2.4.0 causing crashes


Offline Jason Litka

  • Hero Member
  • *****
  • Posts: 1294
  • Karma: +53/-1
    • View Profile
    • Utter Ramblings
igb 2.4.0 causing crashes
« on: January 29, 2014, 02:17:43 pm »
Ok, so my backup box at work keeps crashing with the new igb driver.  On one hand, better performance, on the other, uptime measured in hours...

I keep hitting the link to send a crash report but I've no idea where those go or who sees them.  Here's the data in the report.

EDIT: Code block isn't large enough, switching to pastebin.

http://pastebin.com/veci1em5
« Last Edit: January 29, 2014, 02:21:20 pm by Jason Litka »
I can break anything.

Offline Jason Litka

Re: igb 2.4.0 causing crashes
« Reply #1 on: January 29, 2014, 02:33:34 pm »
This is actually REALLY easy to trigger.  If you run "iperf -s" on a 2.1 box and "iperf -c" on the 2.1.1 box, the 2.1.1 box will crash within 5 or 6 seconds.  If you run a simultaneous bidirectional test (from either side) with "-d", it will crash in less than a second.

Running across a NIC with a different driver, in my case an Intel 82599 (ix), doesn't result in a crash (though I do see performance well below wire speed, as I mentioned in my thread in the Hardware section).
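
For anyone trying to reproduce this, the two client-side runs described above could be sketched roughly as follows. The server address 192.0.2.10 is a placeholder, and the RUN variable is a dry-run helper added for illustration (it defaults to echo, so the functions only print the commands); neither comes from the original report. The 2.1 box is assumed to be running "iperf -s" already.

```shell
#!/bin/sh
# Placeholder address of the 2.1 box running "iperf -s" (an assumption).
SERVER="${SERVER:-192.0.2.10}"

# Dry-run by default: RUN=echo only prints each command.
# Set RUN="" to actually start the tests.
RUN="${RUN-echo}"

# One-way test: reported to panic the 2.1.1 box within 5 or 6 seconds.
repro_unidirectional() {
    $RUN iperf -c "$SERVER"
}

# Simultaneous bidirectional test (-d): reported to panic it in under a second.
repro_bidirectional() {
    $RUN iperf -c "$SERVER" -d
}
```

Keeping the dry run as the default is deliberate, since the real commands are expected to take the firewall down.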

Offline ermal

  • Hero Member
  • *****
  • Posts: 3832
  • Karma: +85/-5
    • View Profile
Re: igb 2.4.0 causing crashes
« Reply #2 on: January 29, 2014, 02:57:25 pm »
Can you please try to set
net.isr.direct: 0
net.isr.direct_force: 0

with sysctl and see if the panic repeats?
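
The "name: value" form above is how sysctl prints a setting; to change one, the form is name=value. A minimal sketch, assuming a root shell on the 2.1.1 box. The RUN variable is a dry-run helper added for illustration and defaults to echo, so nothing is changed unless it is cleared. Values set this way are runtime-only and should not survive a reboot.

```shell
#!/bin/sh
# Dry-run by default: RUN=echo only prints each command.
# Set RUN="" to apply the settings for real.
RUN="${RUN-echo}"

# Set both netisr knobs named in the post to 0.
apply_isr_settings() {
    $RUN sysctl net.isr.direct=0
    $RUN sysctl net.isr.direct_force=0
}

# Read both values back to confirm the change took effect.
show_isr_settings() {
    $RUN sysctl net.isr.direct net.isr.direct_force
}
```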

Also, can you tell whether TSO or LRO is active on your card?
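
One way to answer the TSO/LRO question from a shell is to look at the capability flags that ifconfig lists in its options=<...> line. The helper below is a sketch: the function name is made up for illustration, and igb0 is only an example interface name.

```shell
#!/bin/sh
# Read "ifconfig <ifname>" output on stdin and print which of the
# TSO4, TSO6 and LRO capability flags are listed, one per line.
# Prints "none" when no such flag appears.
offload_flags() {
    flags=$(grep 'options=' | tr '<>,' '\n\n\n' | grep -E '^(TSO4|TSO6|LRO)$') || true
    if [ -n "$flags" ]; then
        printf '%s\n' "$flags"
    else
        echo "none"
    fi
}

# On the firewall (interface name is an example):
#   ifconfig igb0 | offload_flags
```

Splitting the flag list into one flag per line before matching avoids false hits on longer names such as VLAN_HWTSO.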
« Last Edit: January 29, 2014, 03:08:49 pm by ermal »

Offline Jason Litka

Re: igb 2.4.0 causing crashes
« Reply #3 on: January 29, 2014, 06:04:30 pm »
Still panics.

TSO & LRO are disabled on the System > Advanced > Networking page.  As to whether or not the driver is deciding to use them anyway, that I don't know.

Offline qseb

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: igb 2.4.0 causing crashes
« Reply #4 on: February 04, 2014, 04:18:57 am »
Hello

no crash here with unidirectional or bidirectional iperf with 2.1.1 02/Feb
quad intel card + one i210 + one i217
no tweak except:
Code:
echo 'kern.ipc.nmbclusters="131072"'>> /boot/loader.conf
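
Note that ">>" appends a new line every time the command is run, so repeating it leaves duplicate entries in loader.conf. A sketch of a small helper that replaces an existing entry instead is shown below; the function name is hypothetical and not part of pfSense or FreeBSD. Loader tunables are only read at boot, so a reboot is needed either way.

```shell
#!/bin/sh
# set_loader_tunable FILE NAME VALUE   (helper name is hypothetical)
# Ensures FILE contains exactly one NAME="VALUE" line.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    tmp="${file}.tmp.$$"
    touch "$file"
    # Copy every line that does not already assign NAME. The dots in NAME
    # are treated as regex wildcards, which is close enough for tunable names.
    # grep exits non-zero when it copies nothing, which is fine here.
    grep -v "^${name}=" "$file" > "$tmp" || true
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Write back with cat so the original file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example:
#   set_loader_tunable /boot/loader.conf kern.ipc.nmbclusters 131072
```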

Offline Jason Litka

Re: igb 2.4.0 causing crashes
« Reply #5 on: February 04, 2014, 06:47:50 am »
Are you doing the tests across the quad?  Is the quad an i350 card?  Maybe it's specific to those parts and not everything using igb.

Offline qseb

Re: igb 2.4.0 causing crashes
« Reply #6 on: February 04, 2014, 08:01:04 am »
yes quad is i350
I've just tested with i217 (em0) : no crash
I will test with i210 when possible

Offline adam65535

  • Sr. Member
  • ****
  • Posts: 326
  • Karma: +9/-0
    • View Profile
Re: igb 2.4.0 causing crashes
« Reply #7 on: February 10, 2014, 03:31:20 pm »
EDIT: Trying to clarify a few things and added more detail about the systems

I upgraded the backup of an HA cluster to 2.1.1 PRERELEASE (2nd of February snapshot) and it locked up on the 8th around noon (Saturday).  The systems are in a test environment.  The last thing on the screen is a successful login from the 5th.  The keyboard Caps Lock light doesn't respond, etc.

I have an HA (primary / standby) firewall pair (non-production but running the production config) where both primary and backup ran 2.0.3 perfectly for 2 to 3 months, at idle to light load apart from a few days of intense bandwidth testing with iperf about 3 weeks ago.  After that I upgraded only the backup on the 3rd to the 2nd of Feb snapshot of 2.1.1.  I disabled all syncing before that except state syncing to be safe from that point on.  I did performance testing with iperf on the 2nd to the 3rd, going back and forth testing each member as primary.  I did not reboot them after doing that.  I couldn't get it to crash running iperf on those days, so I left the backup as primary from that point on, mainly idle.  I did that mainly because of this thread and the other one by Jason Litka about the ix driver having issues for him.

No testing was being done from the 4th to the current day, so there should have been hardly anything going on with the HA pair.  The only odd thing I am doing is that I disabled sync for everything except the state syncing.  I didn't want the primary (2.0.3) to send a config change to the backup (2.1.1) and accidentally change something that 2.1.1 didn't understand.  I assumed pfsync was compatible.  I wanted to compare performance between 2.0.3 and 2.1.1 PRERELEASE and check the stability of 2.1.1.  I also disabled CARP on the primary and left the backup as active to let the backup run for a while as primary.

I am using the igb driver.  The hardware is new Dell R320 firewalls with Intel PRO/1000 PT quad port cards. Onboard NICs are disabled.  Hyperthreading disabled (4 CPUs after doing that).

/boot/loader.conf.local
kern.ipc.nmbclusters="131072"
hw.igb.num_queues=2
hw.igb.txd="4096"
hw.igb.rxd="4096"
hw.igb.rx_process_limit="1000"

2.1.1-PRERELEASE (amd64)
built on Sun Feb 2 14:47:20 EST 2014
FreeBSD 8.3-RELEASE-p14

Note:  the pfsense 2.1.1 igb driver seems to ignore the hw.igb.num_queues=2 and sets up 4 instead.
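
One way to double check the queue count is to compare what the loader passed in with the number of per-queue interrupt lines in "vmstat -i". This is a sketch under the assumption that the driver labels those lines like "igb0:que 0"; the helper name is made up for illustration.

```shell
#!/bin/sh
# Count per-queue interrupt lines for one interface in "vmstat -i" output
# read from stdin. Assumes the driver labels them like "igb0:que 0".
count_queues() {
    # grep -c prints 0 and exits non-zero when nothing matches.
    grep -c "$1:que " || true
}

# On the firewall (interface name is an example):
#   kenv hw.igb.num_queues          # the value the loader passed in
#   vmstat -i | count_queues igb0   # the queues actually set up
```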

I am going to try and reproduce it of course.
« Last Edit: February 10, 2014, 04:32:35 pm by adam65535 »

Offline nastraga

  • Jr. Member
  • **
  • Posts: 39
  • Karma: +0/-0
    • View Profile
Re: igb 2.4.0 causing crashes
« Reply #8 on: February 12, 2014, 11:10:19 am »
Can confirm crash with i350 card, iperf traffic below 100 Mbps to another host

2.1.1-PRERELEASE (amd64)
built on Tue Feb 11 22:10:25 EST 2014
FreeBSD 8.3-RELEASE-p14

default config options, 1 interface defined and in use (igb0)

Platform IBM x3650m3

Submitted a crash report via gui

i350 card is stable to port saturation on all ports under FreeBSD 10

Offline adam65535

Re: igb 2.4.0 causing crashes
« Reply #9 on: February 18, 2014, 10:04:54 am »
I haven't been able to reproduce the lockup again doing iperf tests on 2.1.1 PRERELEASE, so I am unsure whether it was related to this issue or not.  I might have a different issue or not :).  I don't have a box with 2.1.1 PRERELEASE using an igb driver in any kind of real environment yet.

I did just notice a commit to pfsense-tools though that seems to indicate they are going back to the old drivers.  I am not 100% sure though that the commit means that as I don't know the internal build stuff but it looks like it to me.

https://github.com/pfsense/pfsense-tools/commit/fde16db5dd82641544017d2a2b2b1e04d5332ec4

builder_scripts/conf/patchlist/patches.RELENG_8_3:
"Disable the ndrivers from head they seem to break things more than help in general"
-~~inet_head.tgz~
 -~sys/conf~files.8.3.diff~

EDIT: I didn't check the 2.1.1 forum to notice the sticky...  it has been reverted.
https://forum.pfsense.org/index.php/topic,72763.0.html
« Last Edit: February 18, 2014, 10:17:25 am by adam65535 »

Offline ermal

Re: igb 2.4.0 causing crashes
« Reply #10 on: March 10, 2014, 04:29:21 pm »
Give it another shot with new snapshots.

The panics have been resolved and let us know.

Offline adam65535

Re: igb 2.4.0 causing crashes
« Reply #11 on: March 10, 2014, 05:04:44 pm »
EDIT: I just realized you were probably not talking about my lockup as that seemed to be a different issue...

I never was able to reproduce this specific lockup (not crash).  The only crash issue I have is related to disabling CARP on the master while under load, which happens even with pfSense 2.0.3.  It happens on 2 separate installs with identical hardware.  Since it happens on 2.0.3 too, I didn't bring it up here.  The crash is with the reverted igb drivers (like in current snapshots) and not the backported drivers, which were pulled back out somewhat recently.

https://forum.pfsense.org/index.php?topic=72965.0

Offline Jason Litka

Re: igb 2.4.0 causing crashes
« Reply #12 on: March 11, 2014, 08:01:12 pm »
Quote from: ermal on March 10, 2014, 04:29:21 pm
Give it another shot with new snapshots.

The panics have been resolved and let us know.

Is it in the current snapshots?  I can install Friday and give it a test.  Maybe Thursday.

Offline ermal

Re: igb 2.4.0 causing crashes
« Reply #13 on: March 12, 2014, 09:42:22 am »
Yes it is in the latest ones.

Offline Jason Litka

Re: igb 2.4.0 causing crashes
« Reply #14 on: March 12, 2014, 10:47:25 am »
Quote from: ermal on March 12, 2014, 09:42:22 am
Yes it is in the latest ones.

I'm not getting any snapshots newer than what I'm on (Fri Mar 7 18:35:38 EST 2014).