
Author Topic: Another thread about low bandwidth with VMware ESXi  (Read 221 times)


Offline agbiront

Another thread about low bandwidth with VMware ESXi
« on: June 16, 2017, 08:57:23 am »
Hi,

First of all, let me apologize for making another thread on the same subject. I've already read all the info I could gather through the forum and tried several things, but I can't figure out where the issue is.

I have a home ESXi lab. I'll try to explain everything and what I want to achieve with pfSense. But let's start with the simplest explanation and scenario:

A VM on my network (VLAN20) connects to the internet at full speed (40 Mbps up / 40 Mbps down). The same VM behind pfSense (VLAN40) gets only 5 Mbps upload.

What I tried:

Swapping vNICs: E1000 and VMXNET3
Swapping pNICs: I have Intel PRO1000 server adapters and Realtek 1 Gbps adapters
Modifying hardware offloading: checksum, segmentation, large receive
Open VM Tools: Installed and uninstalled
Installing a new pfSense VM

*********************
My hardware:
Ubiquiti EdgeRouter Lite: well... my edge router. v1.7.0. In charge of inter-VLAN routing. Connected to WAN.
TP-Link TL-SG2424: my access switch, where the VLANs and trunks are defined.
4 ESXi 6.5: 2 clusters of 2. Xeons E5 2670 with 64GB RAM and Xeons X3363 with 16GB RAM. Everything on 6.5. 4 pNICs each, 2 for iSCSI.
Not hardware but every VM is behind a Distributed Virtual Switch.

Logical Network:
VLAN10: All of my Servers. They don't have access to Internet.
VLAN20: My desktops. Full access.
VLAN11/12: iSCSI VLANs.
VLAN40: pfSense VLAN for LAN: Guests VMs (VDIs), Guest WiFi. Everything that I want to isolate from my main Network has to talk to pfSense first.
VLAN50: pfSense WAN network -> connects to router (which is another firewall). So, LAN and WAN on pfSense are actually both LAN.

What I want to achieve is pfSense managing my guest network AND acting as a front end for my servers (reverse proxy) and as a proxy for my Update Managers and any servers that need to contact the internet (so that inbound and outbound traffic for VLAN10 passes through pfSense).

It's working. I have Squid reverse proxy, I can connect to my VMware Horizon Lab, to my NextCloud, webmail, etc. The edgerouter points to pfsense, pfsense works as a frontend, and talks to VLAN10 backends. Everything works! Problem? When I download anything I get ~150KB/s max.

I tried several things, convinced that the problem was my lack of routing skills... to no avail. Disabled the firewalls in between. Nothing. Then I thought... let's make it simple. I connected a VM behind a newly installed pfSense that does nothing but provide internet access on VLAN40, and that VM gets 5 Mbps upload (on a 40/40 Mbps connection).

Conclusion: there is an issue, and I can replicate it on a smaller configuration. So I'm open to suggestions...

Offline dotdash

Re: Another thread about low bandwidth with VMware ESXi
« Reply #1 on: June 16, 2017, 10:21:53 am »
It's not a limitation with ESXi. I've got a pfSense VM with stock em interfaces that does over 500 Mb both ways.
Mine is a simple setup- ESXi is trunked to a switch, one vlan is the LAN, another is WAN- coming in off the switch to the provider. No proxies or other routers in the mix- I'd guess your problem lies there. For reference, VM has 1 CPU, 4 GB, x64. ESXi is 6.0 running on server grade hardware.

Offline agbiront

Re: Another thread about low bandwidth with VMware ESXi
« Reply #2 on: June 16, 2017, 12:01:42 pm »
Inter-VLAN speed is at link speed, 1 Gbps up and down. The Ubiquiti router is quite capable of routing ~1 Gbps full duplex.

VLAN20 Server - VLAN10 Client TCP:

[SUM]   0.00-10.00  sec  1.06 GBytes   909 Mbits/sec                  sender
[SUM]   0.00-10.00  sec  1.06 GBytes   909 Mbits/sec                  receiver

pfSense Server - VLAN20 Client:

Server listening on TCP port 5001
TCP window size: 63.7 KByte (default)
------------------------------------------------------------
[  4] local 192.168.40.3 port 5001 connected with 192.168.20.2 port 56582
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   920 MBytes   772 Mbits/sec


VLAN20 Server - pfSense Client:

------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  4] local 192.168.20.2 port 5001 connected with 192.168.40.3 port 36320
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.9 sec  2.00 MBytes  1.54 Mbits/sec


So, when pfSense sends data it crawls. This is newly installed.
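
As a sanity check on the numbers above, here is a minimal Python sketch that parses an iperf summary line and recomputes the bandwidth from the transfer size and interval. It assumes iperf counts MBytes in units of 1024*1024 bytes and Mbits in units of 10^6 bits, which is consistent with the figures pasted above; the regular expression and the function name `check_line` are only illustrative.

```python
import re

# Matches an iperf summary line such as:
# [  4]  0.0-10.9 sec  2.00 MBytes  1.54 Mbits/sec
LINE = re.compile(
    r"\[\s*\d+\]\s+(?P<start>[\d.]+)-\s*(?P<end>[\d.]+)\s+sec\s+"
    r"(?P<amount>[\d.]+)\s+(?P<unit>[KMG])Bytes\s+"
    r"(?P<rate>[\d.]+)\s+(?P<rate_unit>[KMG])bits/sec"
)

# Assumption: byte units are binary, bit-rate units are decimal.
BYTE_UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}
BIT_UNITS = {"K": 1e3, "M": 1e6, "G": 1e9}


def check_line(line):
    """Return (computed_mbps, reported_mbps), or None if the line does not match."""
    m = LINE.search(line)
    if m is None:
        return None
    seconds = float(m["end"]) - float(m["start"])
    bits = float(m["amount"]) * BYTE_UNITS[m["unit"]] * 8
    computed = bits / seconds / 1e6
    reported = float(m["rate"]) * BIT_UNITS[m["rate_unit"]] / 1e6
    return computed, reported


for sample in (
    "[  4]  0.0-10.0 sec   920 MBytes   772 Mbits/sec",
    "[  4]  0.0-10.9 sec  2.00 MBytes  1.54 Mbits/sec",
):
    computed, reported = check_line(sample)
    print(f"computed {computed:.2f} Mbits/sec, reported {reported:.2f} Mbits/sec")
```

Both lines recompute to within rounding of what iperf printed, so the slow direction really did move only about 2 MBytes in almost 11 seconds.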

I have white boxes... not server grade. But I don't think it's a hardware issue. pfSense has 2C/4GB. Tried 1C. Zero usage.

Trace:

pathping pfsense01t

Tracing route to pfsense01t.lab.tst [192.168.40.3]
over a maximum of 30 hops:
  0  Haswell.lab.tst [192.168.20.2]
  1  edge01.lab.tst [192.168.20.1]
  2  pfsense01t.lab.tst [192.168.40.3]

Traceback:

 traceroute 192.168.20.2
traceroute to 192.168.20.2 (192.168.20.2), 64 hops max, 40 byte packets
 1  192.168.40.2 (192.168.40.2)  0.387 ms  0.339 ms  0.337 ms
 2  haswell (192.168.20.2)  0.549 ms *  0.612 ms


There's a dropped packet!  :o



*******************
EDIT
*******************

On the same ESXi Server


VM on VLAN40 Client - pfSense VLAN40 Server

------------------------------------------------------------
Client connecting to 192.168.40.3, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.40.100 port 65097 connected with 192.168.40.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.34 GBytes  2.01 Gbits/sec

pfSense VLAN40 Client - VM on VLAN40 Server

------------------------------------------------------------
Client connecting to 192.168.40.100, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.40.3 port 15470 connected with 192.168.40.100 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  5.73 GBytes  4.92 Gbits/sec

*****

Now the same but with different ESXi hosts. Now communications have to go to the NIC and the TPLink Switch.

pfSense Server - VM VLAN40 Client:

[  4] local 192.168.40.3 port 5001 connected with 192.168.40.200 port 52995
[  4]  0.0-10.0 sec   233 MBytes   195 Mbits/sec

VM VLAN40 Server - pfSense Client:

------------------------------------------------------------
Client connecting to 192.168.40.200, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.40.3 port 38347 connected with 192.168.40.200 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   641 MBytes   537 Mbits/sec


Conclusions (kinda):
When pfSense has to actually use the physical NIC, bandwidth is affected. The EdgeRouter makes matters even worse. But VLAN to VLAN on my setup works at 900 Mbps between VMs and desktops. The pfSense VM does not like something on my network! And its vNIC performance is sub-par compared to other guest OSes.
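
To put the three test paths side by side, here is a small Python sketch that uses only the iperf figures posted above (in Mbits/sec) and compares how fast pfSense sends with how fast it receives on each path. The dictionary labels and function names are illustrative, not anything pfSense or iperf provides.

```python
# (pfSense receiving, pfSense sending) in Mbits/sec, copied from the iperf runs above.
MEASUREMENTS = {
    "same ESXi host, same VLAN": (2010.0, 4920.0),
    "different ESXi hosts, same VLAN": (195.0, 537.0),
    "routed through the EdgeRouter (VLAN40 <-> VLAN20)": (772.0, 1.54),
}


def send_to_receive_ratio(receive_mbps, send_mbps):
    """Ratio above 1 means pfSense sends faster than it receives."""
    return send_mbps / receive_mbps


def slowest_send_path(measurements):
    """Return the path where sending is worst relative to receiving."""
    return min(measurements, key=lambda path: send_to_receive_ratio(*measurements[path]))


for path, (receive, send) in MEASUREMENTS.items():
    print(f"{path}: send/receive = {send_to_receive_ratio(receive, send):.3f}")

print("worst:", slowest_send_path(MEASUREMENTS))
```

On the two same-VLAN paths pfSense sends faster than it receives; only the routed path collapses, and only in the sending direction.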
« Last Edit: June 16, 2017, 12:38:06 pm by agbiront »

Offline agbiront

Re: Another thread about low bandwidth with VMware ESXi
« Reply #3 on: June 16, 2017, 04:42:25 pm »
I'm not a "networking guy", this is beyond my skill set, please bear with me. But I like troubleshooting things, and this is proving to be... entertaining.

At least this post serves as my captain's log  :P

I fired up Wireshark, and started to capture packets listening on port 5001, to isolate the iperf traffic and to try and figure something out.

First, I set pfSense up as the server and uploaded at 758 Mbps, to have an example of a "good" TCP conversation. I immediately noticed some duplicate ACKs here and there... But I can't tell whether that's common or not.

OK, so then I captured with pfSense uploading data to my desktop set up as the server... and it's a mess.

I got TCP Dup ACKs all over, and TCP Fast Retransmission here and there. But something caught my eye: packet length. ALL packets are ~1514 bytes, but on the other capture I have 21954 bytes and even bigger!
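
A rough way to read those two packet lengths, sketched in Python. It assumes plain Ethernet, IPv4 and TCP headers with no options (14 + 20 + 20 bytes) and an MTU of 1500. Under those assumptions a 1514-byte frame is simply a full-sized Ethernet frame, and a captured 21954-byte "packet" holds a whole number of full-sized segments. That pattern is what a capture typically shows when the capturing machine's NIC does segmentation offload, because the capture sees the large buffer before the NIC splits it. Whether that is what is happening here is an assumption, not something the capture proves.

```python
ETHERNET_HEADER = 14  # bytes, no VLAN tag in the captured frame (assumption)
IPV4_HEADER = 20      # bytes, no IP options (assumption)
TCP_HEADER = 20       # bytes, no TCP options (assumption)
MTU = 1500


def wire_frame_length(mtu=MTU):
    """Largest Ethernet frame a capture shows for a given MTU."""
    return mtu + ETHERNET_HEADER


def max_segment_size(mtu=MTU):
    """TCP payload bytes that fit in one full-sized frame."""
    return mtu - IPV4_HEADER - TCP_HEADER


def segments_in_captured_frame(captured_length, mtu=MTU):
    """Return (full segments, leftover payload bytes) for a captured frame."""
    payload = captured_length - ETHERNET_HEADER - IPV4_HEADER - TCP_HEADER
    return divmod(payload, max_segment_size(mtu))


print(wire_frame_length())                # 1514
print(segments_in_captured_frame(21954))  # (15, 0)
```

So the 1514-byte frames are not by themselves a sign of trouble; the duplicate ACKs and fast retransmissions are the more interesting symptom.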

 :o

Now what?
« Last Edit: June 16, 2017, 04:45:55 pm by agbiront »

Offline lexcomman

Re: Another thread about low bandwidth with VMware ESXi
« Reply #4 on: June 16, 2017, 11:43:14 pm »
I am in a similar situation running pfSense 2.3.4 release in VMware. I have narrowed my connection speed limitation down to the pfSense VM. iperf between other VMs and physical hosts, in both directions, gets 900+ Mbps throughput on the same VLAN ID. iperf from a physical machine to the local VLAN IP of the pfSense VM gets 900+ Mbps throughput. Once pfSense has to route to a different VLAN, I drop to ~500 Mbps throughput. The result is the same whether I use the E1000 or VMXNET3 NIC for the pfSense VM. I'm running Dell server hardware, and the pfSense VM is quad-core with 4 GB of RAM and no Snort or other modules installed. I even destroyed the VM and rebuilt it, thinking there was corruption somewhere. I will do a packet capture and see if I get the same duplicate ACK issue.

Offline johnpoz

Re: Another thread about low bandwidth with VMware ESXi
« Reply #5 on: June 17, 2017, 05:14:28 am »
"Once pfSense has to route to a different VLAN I drop to ~500Mbps through put"

Are you hairpinning those intervlan tests?  Why are you testing TO pfsense when you should be testing through pfsense..

"pfSense Server - VM VLAN40 Client:"

Sounds like pfSense is the endpoint in this iperf test. You're also using a small 65k window size.
- An intelligent man is sometimes forced to be drunk to spend time with his fools.
- If I have helped you and want to help back, https://www.freebsdfoundation.org/donate/
- Please don't PM me for personal help, info you don't want public sure. Link to thread you would like me to look at ok, etc.
1x SG-2440 2.3.3_p1 (work)
1x 2.4.0-BETA Jun 22 03:42:40 VM running on esxi 6.5 (home)

Offline lexcomman

Re: Another thread about low bandwidth with VMware ESXi
« Reply #6 on: June 17, 2017, 06:55:40 am »
For my part, I was testing from a physical client to the closest endpoint, working my way out to the furthest endpoint, to see where the breakage was happening. I then did the same testing from a virtual machine on the same ESX host. At first I was not sure whether the VMware network configuration, the switch configuration, or pfSense was causing my problem. Doing this narrowed it straight down to pfSense being the problem child. I also did other testing from VM to VM on the ESX host.

Offline johnpoz

Re: Another thread about low bandwidth with VMware ESXi
« Reply #7 on: June 17, 2017, 07:20:02 am »
And how are you connecting between VLANs? I take it that when you say VLAN, you have your VLAN tag sitting on a physical interface. So you're going in and out of the same physical interface for traffic between the VLANs. That is a hairpin, and yes, you just cut the bandwidth in half...

You need to be clear on how you're testing...

Are you on something like the top connection with no hairpin, or more like the bottom with a hairpin when devices are talking to each other? Without understanding how you have this all connected together, my guess is yes, you're doing a hairpin, and yes, that happens when you use VLAN tags and share the same equipment for traffic flowing between these VLANs.

Offline agbiront

Re: Another thread about low bandwidth with VMware ESXi
« Reply #8 on: June 17, 2017, 08:21:06 am »
I don't think we are having the same issue. We both see a penalty when pfSense has to route, but you get 500 Mbps, and my connection drops from 750 Mbps to 1.5 Mbps.

I know the TCP window was smaller (65 KBytes vs 208 KBytes). My Ubiquiti router can route at 1 Gbps with 65 KBytes anyway. The question is: with a window size of 64 KBytes, as I understand it, shouldn't the packets be up to 65 KBytes in size?
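
On the window-size question, a back-of-the-envelope Python sketch. The TCP window limits how many bytes can be in flight before an ACK comes back, not how large a single packet on the wire may be, so it caps throughput at roughly window / round-trip time. The 0.6 ms round-trip time below is an assumption taken from the traceroute timings earlier in the thread, and the function name is illustrative.

```python
def window_limited_mbps(window_bytes, rtt_seconds):
    """Upper bound on TCP throughput when only the window is the limit."""
    return window_bytes * 8 / rtt_seconds / 1e6


WINDOW_BYTES = 65.0 * 1024  # the 65.0 KByte default that iperf reported on pfSense
ASSUMED_RTT = 0.0006        # about 0.6 ms, read off the traceroute output (assumption)

print(round(window_limited_mbps(WINDOW_BYTES, ASSUMED_RTT)))  # about 887
```

Under those assumptions a 65 KByte window still allows several hundred Mbits/sec on this LAN, so the window alone would not explain a drop to 1.5 Mbps.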

As I said before, I captured the packets:

To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes.


Offline lexcomman

Re: Another thread about low bandwidth with VMware ESXi
« Reply #9 on: June 17, 2017, 08:34:38 am »
Thanks for the follow-ups, by the way. Below is more on how my setup works; sorry it's a bit rough, I was throwing it together while drinking my morning coffee. I can go from Physical -> VM1 without a network performance drop. I can go from Physical -> pfSense local gateway without a performance drop. When I go from Physical -> pfSense VLAN 10 or VLAN 12 gateway address, I experience a performance drop. When I go from Physical -> VM2, I experience a performance drop. I can swap all this testing around by using VM1 as my source and get the same results. All my indicators point to pfSense limiting my bandwidth somehow. I have trashed my original VM and started from scratch with a vanilla install, with the same results. I have followed the network optimization thread, but that has not changed my results either.

ESX Host Specs:
Dual Xeon Quad Core 2.4Ghz
16GB RAM
8 x 10K SAS drives in Raid 10
Dual onboard Broadcom Gigabit NICs.

Offline agbiront

Re: Another thread about low bandwidth with VMware ESXi
« Reply #10 on: June 17, 2017, 08:52:36 am »
Why are you using a port channel? Could you try disabling it? Route based on virtual port ID should be enough to load balance two 1 Gbps pNICs. LACP/PortChannel/ChannelBonding/whatever only gives headaches...

--------------

Packet segmentation happens on layer 3, am I right? Because in my environment, when pfSense talks to a client on its own VLAN (no routing), performance is okay-ish. But when pfSense has to route to another VLAN, performance drops like a brick to 1.5 Mbps.

In my capture, it's evident that packets are being segmented. iperf running on pfSense sends 65 KB packets, but on the interface I only see 1514 bytes.

Is it possible to completely disable packet segmentation? If it's needed, the Ubiquiti router should be in charge.


Edit: several edits. I need my morning coffee.

Offline johnpoz

Re: Another thread about low bandwidth with VMware ESXi
« Reply #11 on: June 17, 2017, 09:27:12 am »
"To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."

What part did you not understand about not testing to pfSense as the endpoint?  pfSense is NOT a file server, it's a router - if you want to test its performance at routing/firewalling/NATing/etc, it will all be THROUGH pfSense, not to it.

As to your load sharing/port channel - kind of utterly pointless unless you have LOTS of clients talking to lots of servers - any single client talking to any single server is going to go through the same interface.  One thing I will agree with is that yes, that sort of setup normally makes troubleshooting bandwidth issues way more complex.

Offline lexcomman

Re: Another thread about low bandwidth with VMware ESXi
« Reply #12 on: June 17, 2017, 09:38:53 am »
For me this is a home test lab that I just mess around and learn with. I would normally agree about the port channel; what I didn't mention before is that I had taken ESX down to a single trunked port connection. Sadly, this didn't change my throughput results either. I did test to the pfSense interfaces, but that was simply for testing each hop in the chain. In my testing, performance always suffered once pfSense began to route the traffic.

Offline agbiront

Re: Another thread about low bandwidth with VMware ESXi
« Reply #13 on: June 17, 2017, 10:21:46 am »
"To pfSense from Desktop @ 750Mbps = Packets bigger than 30KBytes
To Desktop from pfSense @ 1.5Mbps = All packets are 1514 Bytes."

"What part did you not understand about not testing to pfsense as the endpoint?  Pfsense is NOT a file server its a router - if you want to test the performance of it routing/firewall/natting/etc it will all be THROUGH pfsense not too it."


Ok. Care to explain the logic behind that statement? I'm not testing against a VM behind pfSense just to narrow down the issue. I don't see how testing against pfSense can be an issue at all.

Do you REALLY need me to make that test and show you the exact same results?

When pfSense has to ROUTE traffic, that is, from one of my VLANs to another, packet size is 1514 bytes and bandwidth is 1.5 Mbps.
When pfSense is at Layer 2, that is, on the same VLAN as the other VM, packet size is whatever the application wants it to be and bandwidth is 500+ Mbps.


EDIT:

Just to prove a point:

------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  4] local 192.168.20.2 port 5001 connected with 192.168.20.217 port 49772
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   934 Mbits/sec
[  4] local 192.168.20.2 port 5001 connected with 192.168.40.202 port 49818
[  4]  0.0-11.9 sec  5.50 MBytes  3.89 Mbits/sec

192.168.40.202 is a VM behind pfSense. Exact same issue as with pfSense itself. When it has to SEND data to the server on VLAN20, it crawls at 3.89 Mbps.

BUT

[  3] local 192.168.20.2 port 52252 connected with 192.168.40.202 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   945 MBytes   792 Mbits/sec

it RECEIVES data at 792 Mbps.

Packet captured by Wireshark:

From VLAN20 (my Desktop) to VLAN40 (pfSense): 800Mbps ~30KBytes per data packet
From VLAN40 (pfSense) to VLAN20 (Desktop): 4Mbps 1514bytes per packet

Clear and concise question:

How can I completely disable packet segmentation on pfSense?
Why is pfSense segmenting packets when my router is not?

MTU on everything is 1500.
« Last Edit: June 17, 2017, 10:27:54 am by agbiront »