Author Topic: Unbelieveably bad performance  (Read 6251 times)

Offline Douglas Haber

  • Jr. Member
  • **
  • Posts: 41
  • Karma: +0/-0
    • View Profile
Re: Unbelieveably bad performance
« Reply #15 on: January 16, 2015, 08:44:02 am »
Quote
Just to confirm, you've definitely not fallen foul of the driver change issue I linked to? I can't really see why it would affect you since you're not using VLANs or anything other than a standard config but it's worth checking.

Steve

I missed your question. Probably.

It was not xn in 2.1.5, it was re(4)

Hrmm.. found this on the ML:

http://lists.freebsd.org/pipermail/freebsd-xen/2014-April/002065.html

Maybe FreeBSD 10 just does not play nice on Xen.

Edit 2 - more quirks involving XS..

http://lists.freebsd.org/pipermail/freebsd-xen/2014-February/002010.html
« Last Edit: January 16, 2015, 08:50:35 am by Douglas Haber »

Offline stephenw10

  • Administrator
  • Hero Member
  • *****
  • Posts: 12309
  • Karma: +494/-15
    • View Profile
Re: Unbelieveably bad performance
« Reply #16 on: January 16, 2015, 08:51:37 am »
Hmm, well that's interesting. You specified Realtek emulation in the Xen config then I assume? I'm unfamiliar with Xen.
I would try removing the paravirtualised NIC support in Xen so that pfSense goes back to using the re driver and see if that makes any difference. Additionally I would set it to emulate Intel NICs rather than Realtek.
As I say though I can't really see why the xn driver should be causing problems in your basic setup. Try removing all the hardware offloading options in System: Advanced: Networking:

Steve

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #17 on: January 16, 2015, 08:52:53 am »
Realtek is the default with XenServer. Switching to Intel emulation requires some hackery I am not ready to be doing yet. I don't want to change Xen necessarily.

EDIT: By hackery, I mean just a small change really (http://www.netservers.co.uk/articles/open-source-howtos/citrix_e1000_gigabit) but I also have other VMs running, and don't want to change too much.

I found this, which is interesting..

Quote
ssh from the Windows PV host to the FreeBSD PV DomU host appears to work
fine. Attempting to 'route' traffic from the Windows PV host 'through' the
FreeBSD PV DomU fails - pings go, DNS goes, initial TCP 'setups' go - but
stuff dies thereafter (i.e. may be packet size related or something).

Sounds pretty much like my issue (re: HTTP not working), even though, as another poster mentioned, the requests are there.

http://lists.freebsd.org/pipermail/freebsd-xen/2014-February/002018.html
« Last Edit: January 16, 2015, 09:10:26 am by Douglas Haber »

Online johnpoz

  • Hero Member
  • *****
  • Posts: 15775
  • Karma: +1503/-210
  • Not a pfSense employee, they cannot fire me...
    • View Profile
Re: Unbelieveably bad performance
« Reply #18 on: January 16, 2015, 09:16:39 am »
ok this looks different than before..

So looks like you're getting back the syn,ack..  But then when you send a get, a 404 is sent back..  But then that is not working..


GET / HTTP/1.1
Host: 65.98.6.38
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8

HTTP/1.1 404 Not Found
Date: Fri, 16 Jan 2015 13:45:34 GMT
Server: Apache/2.2.22 (Debian)

Then on the lan side you don't see the get??  Something really odd going on here..

From your wan sniff you can see that 404 was sent, but then you see retrans on the get and 404.  But on the lan side not even seeing the get..  Were these sniffs taken at the same time?

edit: Ok looks like these were taken at different times..  wan goes from 7:45:31 to 7:47:14  But lan is from 7:47:31 to 7:49:16...  You really need to take the captures at the same time.. And it wouldn't hurt to have a sniff running over the same time period on the webserver.
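
For what it's worth, the time-window check above can be sketched in a few lines of Python. This is only an illustration: the helper names (`parse_clock`, `overlap_seconds`) are made up here, and it assumes both captures were taken on the same day on boxes whose clocks agree.

```python
from datetime import datetime

def parse_clock(text):
    """Parse a wall-clock time such as '7:45:31' (same day assumed)."""
    return datetime.strptime(text, "%H:%M:%S")

def overlap_seconds(first, second):
    """Return how many seconds two (start, end) capture windows share."""
    start = max(parse_clock(first[0]), parse_clock(second[0]))
    end = min(parse_clock(first[1]), parse_clock(second[1]))
    # A negative difference means the windows never overlap at all.
    return max(0.0, (end - start).total_seconds())

# First and last packet times quoted for the two sniffs.
wan = ("7:45:31", "7:47:14")
lan = ("7:47:31", "7:49:16")
print(overlap_seconds(wan, lan))  # 0.0
```

Zero seconds of overlap means no packet can appear in both files, so the two sniffs can't be compared hop by hop.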

« Last Edit: January 16, 2015, 09:23:11 am by johnpoz »
- An intelligent man is sometimes forced to be drunk to spend time with his fools.
- Please don't PM me for personal help
- if you want to say thanks applaud or https://www.freebsdfoundation.org/donate/
1x SG-2440 2.4.3-RELEASE (work)
1x SG-3100 2.4.3-RELEASE (work)
1x SG-4860 2.4.3-RELEASE (home)

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #19 on: January 16, 2015, 09:21:32 am »
1) The 404 is to be expected. I wanted something simple to be spit back for testing purposes, rather than a several-MB webpage, which is what would be on it in production. There is nothing to be served on the webserver right now.

2) Very close. A couple of seconds apart max. I'll work on a set taken at exactly the same time.
« Last Edit: January 16, 2015, 09:24:40 am by Douglas Haber »

Online johnpoz

Re: Unbelieveably bad performance
« Reply #20 on: January 16, 2015, 09:24:00 am »
no they are not a couple of seconds apart.. they are completely different time frames.  See my edit.

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #21 on: January 16, 2015, 09:24:24 am »
I'll run a new set, same time. Hang on.

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #22 on: January 16, 2015, 09:36:43 am »
Same URLs. Same time. Literally within 1-2 seconds this time, as quick as I could move the cursor and hit go.

No webserver capture in this group, though.

EDIT: Let me see if I can do it again and turn up verbosity on pfSense. Its capture is way, way less verbose on the LAN interface than my tcpdump was on the WAN.

Online johnpoz

Re: Unbelieveably bad performance
« Reply #23 on: January 16, 2015, 09:39:15 am »
well wan is going to see all the noise of a typical wan connection ;)  I would expect to see lots of noise ;)

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #24 on: January 16, 2015, 09:39:59 am »
I forgot to take off the default limit of 100 packets on the pf capture.  :-X

Redoing now

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #25 on: January 16, 2015, 09:47:46 am »
Correctly done dumps are there now.

Offline marcelloc

  • Hero Member
  • *****
  • Posts: 13704
  • Karma: +609/-8
    • View Profile
Re: Unbelieveably bad performance
« Reply #26 on: January 16, 2015, 12:45:29 pm »
Are you using xentools on this vm?

http://blog.feld.me/posts/2014/07/pfsense-on-citrix-xenserver/

I've played with a 2.2 beta version on xen server with ~800mbit throughput IIRC.

Online johnpoz

Re: Unbelieveably bad performance
« Reply #27 on: January 16, 2015, 01:57:46 pm »
Ok so looking at these dumps..

You have two connections coming in to 80, one from source port 43293 and another on 27618 both from this 67.81.220.99 IP

You see the syn,ack back and then the ack from the 43293 connection.  But you never see the ack from the syn,ack sent to 27618

You also see a get, an ack to that and then sending of the 404..  Clearly you can see the stuff pfsense gets on its wan it sends on to the lan.  Stuff it sees on the lan it sends out the wan.

I see pfsense doing what it is supposed to do, it forwards on the packets..  But then on the wan side it seems that box is not getting the responses that were sent, so it sends retrans..  And on the lan side it doesn't get the response it expected so it retrans.

Looks to me like you have an issue with communication on the wan side..

So you see the get come in on wan, you see it sent on to the lan, you see the lan ack back, you see it send 404..  But then you see inbound from 220.99 saying hey I'm going to resend this get because I never got an ack..  And it clearly didn't get the 404 that was sent.

Pfsense from your sniff clearly put it on the wire - but seems to be getting lost..  And 220.99 is not getting it.
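
If you want to spot the retransmissions without eyeballing the whole dump, a rough Python sketch along these lines would do it. It assumes you have already pulled each TCP segment out of the capture as a tuple; the function name, the tuple layout, and the sequence numbers and lengths below are all made up for illustration, only the addresses and ports are from the thread.

```python
from collections import Counter

def find_retransmissions(segments):
    """Return data segments that show up more than once in a capture.

    Each segment is a (src, sport, dst, dport, seq, payload_len) tuple.
    Pure ACKs (payload_len == 0) are skipped because they legitimately
    repeat the same sequence number.
    """
    counts = Counter(seg for seg in segments if seg[5] > 0)
    return [seg for seg, seen in counts.items() if seen > 1]

# Hypothetical sequence numbers and lengths, for illustration only.
get = ("67.81.220.99", 43293, "65.98.6.38", 80, 1, 400)
ack = ("65.98.6.38", 80, "67.81.220.99", 43293, 1, 0)
not_found = ("65.98.6.38", 80, "67.81.220.99", 43293, 1, 180)

wan_capture = [get, ack, not_found, get, not_found]
for seg in find_retransmissions(wan_capture):
    print("retransmitted:", seg)
```

Seeing both the get and the 404 come back as repeats is the pattern described above: each side keeps resending because it never hears the other side's answer.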

Offline cmb

  • Hero Member
  • *****
  • Posts: 11226
  • Karma: +896/-7
    • View Profile
    • Chris Buechler
Re: Unbelieveably bad performance
« Reply #28 on: January 16, 2015, 01:59:27 pm »
The LAN capture has broken TCP checksums on all the retransmitted traffic. Not on everything though, and not null checksums (which would be the scenario where it's capturing before the NIC's checksum offloading adds the checksum), which suggests that's the likely cause. Have you disabled hardware checksum offloading under System>Advanced, Networking tab? Probably best to reboot afterwards.
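
To make the null versus broken distinction concrete, here is a rough Python sketch of how a TCP checksum could be checked by hand. It is only an illustration under some assumptions: IPv4 only, the raw TCP segment (header plus payload) has already been extracted from the capture, and the function names are invented for this example.

```python
import socket
import struct

TCP_PROTO = 6
CSUM_OFFSET = 16  # position of the checksum field inside the TCP header

def ones_complement_sum(data):
    """16-bit one's-complement sum used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return total

def classify_tcp_checksum(src_ip, dst_ip, segment):
    """Return 'null', 'valid' or 'broken' for a raw TCP segment."""
    (stored,) = struct.unpack_from("!H", segment, CSUM_OFFSET)
    if stored == 0:
        return "null"  # checksum field was never filled in
    # IPv4 pseudo-header: source, destination, zero, protocol, TCP length.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, TCP_PROTO, len(segment)))
    # With a correct checksum in place, everything sums to 0xFFFF.
    ok = ones_complement_sum(pseudo + segment) == 0xFFFF
    return "valid" if ok else "broken"
```

A "null" result is the capture-before-offload case described above, while "broken" means a non-zero checksum that simply doesn't match the packet contents.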

Offline Douglas Haber

Re: Unbelieveably bad performance
« Reply #29 on: January 17, 2015, 08:03:43 am »
Quote
Are you using xentools on this vm?

http://blog.feld.me/posts/2014/07/pfsense-on-citrix-xenserver/

I've played with a 2.2 beta version on xen server with ~800mbit throughput IIRC.

I had/have the same issue, tools or not.

Edit: Throughput on the pfSense VM itself has been perfect this entire time. No slowness at all. It's only the VMs behind the pfSense VM.

Quote
Ok so looking at these dumps..

You have two connections coming in to 80, one from source port 43293 and another on 27618 both from this 67.81.220.99 IP

You see the syn,ack back and then the ack from the 43293 connection.  But you never see the ack from the syn,ack sent to 27618

You also see a get, an ack to that and then sending of the 404..  Clearly you can see the stuff pfsense gets on its wan it sends on to the lan.  Stuff it sees on the lan it sends out the wan.

I see pfsense doing what it is supposed to do, it forwards on the packets..  But then on the wan side it seems that box is not getting the responses that were sent, so it sends retrans..  And on the lan side it doesn't get the response it expected so it retrans.

Looks to me like you have an issue with communication on the wan side..

So you see the get come in on wan, you see it sent on to the lan, you see the lan ack back, you see it send 404..  But then you see inbound from 220.99 saying hey I'm going to resend this get because I never got an ack..  And it clearly didn't get the 404 that was sent.

Pfsense from your sniff clearly put it on the wire - but seems to be getting lost..  And 220.99 is not getting it.

Not sure where the issue is then, if it is "WAN side", since every other box connected to that handoff from the datacenter is experiencing no issues whatsoever, and as previously stated, FreeBSD 10 (or I guess pfSense 2.2) is the only thing experiencing issues. The exact same WAN uplink/cable/etc. in the same hypervisor can do full line rate in the other VMs.

Quote
The LAN capture has broken TCP checksums on all the retransmitted traffic. Not on everything though, and not null checksums (which would be the scenario where it's capturing before the NIC's checksum offloading adds the checksum), which suggests that's the likely cause. Have you disabled hardware checksum offloading under System>Advanced, Networking tab? Probably best to reboot afterwards.

I did disable it, but haven't tried rebooting. Trying now.
« Last Edit: January 17, 2015, 08:16:44 am by Douglas Haber »