Author Topic: High Availability HA authentication failure  (Read 285 times)

Offline andipandi

  • Jr. Member
  • **
  • Posts: 56
  • Karma: +0/-1
    • View Profile
High Availability HA authentication failure
« on: October 29, 2017, 09:57:18 am »
I had HA set up and settings synced for a while, but at some point syncing stopped and I only noticed recently.
It might have been the 2.4.0 update, or me mucking around with firewall or other settings. From what I can see now, the firewall is not the problem, since everything is permitted in the local network.

Primary is on 2.4.0, secondary on 2.3.3. I keep the secondary around for primary reboots, but also to have a known-good version, since IPsec changes sometimes seem to break functionality and are a nightmare to debug. Primary is on 192.168.0.2, secondary on 192.168.0.3, and they share a CARP IP on 192.168.0.1 that should not matter in this scenario.

So.. what I get on primary is

Quote
An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version). @ 2017-10-29 15:21:14

in notification bubble

and in system logs

Quote
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: New alert found: An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version).
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version).
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.0.3:443/xmlrpc.php.

On the secondary, I get in the system logs

Quote
Oct 29 15:21:15  sshlockout  71691  Locking out 192.168.0.2 after 15 invalid attempts 
Oct 29 15:21:15  php-fpm  79982  /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2 
Oct 29 15:21:15  php-fpm  79982  /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2 during sync settings. 

Also, the primary's IP (192.168.0.2) now gets locked out on the secondary.

I changed the password on the admin account a few times, always keeping it the same in the User Manager and in the HA settings. Communication also seems to work at least partially; otherwise I would not get this error or the log entries on the secondary.
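To make the pattern in the secondary's log easier to see, here is a minimal Python sketch that counts the failed logins and the resulting lockout. The log lines are copied from the quote above; the regular expressions and the `summarize` helper are illustrative assumptions, not part of pfSense.

```python
import re
from collections import Counter

# Log lines copied from the secondary's system log quoted above.
LOG = """\
Oct 29 15:21:15  sshlockout  71691  Locking out 192.168.0.2 after 15 invalid attempts
Oct 29 15:21:15  php-fpm  79982  /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2
Oct 29 15:21:15  php-fpm  79982  /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2 during sync settings.
"""

# Hypothetical patterns written for this example only.
IPV4 = r"\d+(?:\.\d+){3}"
AUTH_ERROR = re.compile(rf"authentication error for '(?P<user>[^']+)' from (?P<ip>{IPV4})")
LOCKOUT = re.compile(rf"Locking out (?P<ip>{IPV4}) after (?P<count>\d+) invalid attempts")


def summarize(log_text):
    """Return (failed logins per (user, ip), set of locked-out IPs)."""
    failures = Counter()
    locked_out = set()
    for line in log_text.splitlines():
        auth = AUTH_ERROR.search(line)
        if auth:
            failures[(auth.group("user"), auth.group("ip"))] += 1
        lock = LOCKOUT.search(line)
        if lock:
            locked_out.add(lock.group("ip"))
    return failures, locked_out


failures, locked_out = summarize(LOG)
for (user, ip), count in failures.items():
    print(f"{count} failed login(s) for {user!r} from {ip}")
print("locked out:", sorted(locked_out))
```

Each sync attempt from the primary counts as a failed login on the secondary, which would explain why the primary's address ends up locked out.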

Offline dotdash

  • Hero Member
  • *****
  • Posts: 1915
  • Karma: +99/-2
    • View Profile
Re: High Availability HA authentication failure
« Reply #1 on: October 30, 2017, 05:34:01 pm »
Both boxes have to be running the same version to sync. Upgrade the secondary and it should work.

Offline andipandi

  • Jr. Member
  • **
  • Posts: 56
  • Karma: +0/-1
    • View Profile
Re: High Availability HA authentication failure
« Reply #2 on: October 31, 2017, 06:05:04 pm »
Thanks.
Seems there are other issues involved as well.
https://forum.pfsense.org/index.php?topic=137572.0
I regret doing the update... thought it would be more stable..

Offline Derelict

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 9088
  • Karma: +1037/-306
    • View Profile
Re: High Availability HA authentication failure
« Reply #3 on: November 01, 2017, 02:08:10 am »
That is an edge case and probably unrelated to what you are seeing. It is plainly telling you the reason for the failure is the firmware version mismatch. You cannot XMLRPC sync from mismatched versions because the configuration file format might have changed.

The intended order is: upgrade the secondary, fail over to it, and confirm that it works. After that, both HA nodes need to be on the same version. The mismatch period should be kept short, just long enough to determine that everything is fine.

If it does not work, you need to fail back and rebuild the secondary on the old version so they match again.
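The rule described above can be sketched as a small pre-sync check. This is only an illustration in Python: `parse_version` and `sync_advice` are hypothetical helpers, and the sketch assumes plain dotted version strings such as "2.4.0".

```python
def parse_version(text):
    """Turn '2.4.0' into (2, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def sync_advice(primary, secondary):
    """Apply the advice from this thread to a pair of node versions."""
    p, s = parse_version(primary), parse_version(secondary)
    if p == s:
        return "versions match: XMLRPC sync should be possible"
    if s > p:
        return ("secondary is newer: fail over to it, test, then upgrade "
                "the primary; expect no sync until the versions match")
    return ("primary is newer: the upgrade was done in the wrong order; "
            "bring both nodes to the same version before expecting sync")


print(sync_advice("2.4.0", "2.3.3"))
print(sync_advice("2.4.1", "2.4.1"))
```

With the versions from the first post (primary 2.4.0, secondary 2.3.3) the check lands in the last branch, which matches the failure being discussed.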
« Last Edit: November 01, 2017, 02:11:25 am by Derelict »
Las Vegas, Nevada, USA
Use this diagram to describe your issue.
The pfSense Book is now available for just $24.70!
Do Not PM For Help! NO_WAN_EGRESSTM

Offline andipandi

  • Jr. Member
  • **
  • Posts: 56
  • Karma: +0/-1
    • View Profile
Re: High Availability HA authentication failure
« Reply #4 on: November 02, 2017, 02:51:13 am »
Thank you!

Can I downgrade (you call it rebuild?) without issues? Is there a guide on how to do it?

I think I upgraded to 2.4 too early, and I now see some issues on a system I am using in production.

BTW, sync between different 2.3 subversions was working fine.

Offline Derelict

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 9088
  • Karma: +1037/-306
    • View Profile
Re: High Availability HA authentication failure
« Reply #5 on: November 02, 2017, 04:35:58 am »
Since you did it backwards (upgrading the primary first) the easiest way to get it back is to fail over to the secondary, reinstall 2.3.3 on the primary, and restore the 2.3.3 configuration backup you took before you upgraded.

Offline hbc

  • Jr. Member
  • **
  • Posts: 40
  • Karma: +1/-0
    • View Profile
Re: High Availability HA authentication failure
« Reply #6 on: November 02, 2017, 04:51:28 am »
Hi,

I did it the other way. First upgraded secondary, tested a week and then upgraded primary. Both nodes have same firmware 2.4.1 now, but HA-Sync still fails.
Is there a chance to get them in sync again without fresh install? I don't wanna lose my squid logs.

Thanks

Offline Derelict

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 9088
  • Karma: +1037/-306
    • View Profile
Re: High Availability HA authentication failure
« Reply #7 on: November 02, 2017, 06:39:37 am »
Quote
Both nodes have same firmware 2.4.1 now, but HA-Sync still fails.
Fails with what log messages?

Quote
I don't wanna lose my squid logs.
That's why you don't use a firewall as a log-retention device.

Offline hbc

  • Jr. Member
  • **
  • Posts: 40
  • Karma: +1/-0
    • View Profile
Re: High Availability HA authentication failure
« Reply #8 on: November 02, 2017, 07:58:24 am »
Quote
Fails with what log messages?

Code:
A communications error occurred while attempting to call XMLRPC method filter_configure:
A communications error occurred while attempting to call XMLRPC method restore_config_section:
A communications error occurred while attempting to call XMLRPC method exec_php:
A communications error occurred while attempting to call XMLRPC method host_firmware_version:
At the moment I consider 2.4.x firmware and HA as broken. I get more and more issues. The sync is just one issue. So I will have to downgrade to 2.3.x again.
  • After the upgrade, I can only reach the firewall after stopping the packet filter ('service pf onestop'), even though the web interface still shows the rules that allow me access to the machine.
  • The defined gateways are pingable from the firewall CLI, but the web interface displays them as down. Only the IPv6 gateway is shown as up.

The secondary firewall, on 2.4.1 with the same synced configuration, works. But since the primary has issues with its gateways and with access to the interface, I do not dare to leave maintenance mode: I am afraid of crashing the network and of not being able to fall back to the secondary again. So I will rebuild the primary from the backup configuration.

Quote
That's why you don't use a firewall as a log-retention device.
I just need them on machine for lightsquid reports.

Edit:
Made backups and did a fresh install of 2.4.1. At least I have access to the firewall via the web interface again, but sync behaves oddly. When I force a sync via [Status -> Filter Reload -> Force Config Sync], it works. But when I make changes such as rules, or take other actions that are synced in the background, I get the messages shown above.
At the same time, I get this on the secondary machine:
Code:
/xmlrpc.php: ERROR! Either LDAP search failed, or multiple users were found.
This must be the sync process, since nobody else logged into this machine during my restore.

I use LDAP for user authentication to the firewall itself and a local user for sync. It seems that the background sync uses the default server defined in [System -> User Manager -> Settings -> Authentication Server] to look up the sync user, while the forced sync uses the local database. Maybe I have to create an LDAP user account for syncing? I would prefer a fallback to the local database instead.
« Last Edit: November 02, 2017, 11:15:10 am by hbc »

Offline Derelict

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 9088
  • Karma: +1037/-306
    • View Profile
Re: High Availability HA authentication failure
« Reply #9 on: November 03, 2017, 02:22:56 pm »
Hmm. I have never done an HA pair with an LDAP-configured authentication backend for the webgui (which will also be xmlrpc sync.)

Later versions (including 2.4.X) fixed the long-standing issue of being unable to specify the xmlrpc username and password.

It might be worth creating a local user on the primary (which should sync to the secondary) that specifically includes the System - HA node sync permission, then specifying that user on the primary in the XMLRPC settings.

The secondary is the one that is controlling where things are authenticated. Are you certain the user being specified is present there? Does the XMLRPC sync user and password pass on the secondary in Diagnostics > Authentication? Is there any significant delay? Are the Authentication servers specified identical on the primary and the secondary? Do both nodes pass Diagnostics > Authentication?
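The questions above amount to a short checklist, which could be written down roughly like this. It is a sketch only: the `NodeAuth` record and `check_pair` function are hypothetical, and the values would have to be filled in by hand from each node's User Manager settings and Diagnostics > Authentication results.

```python
from dataclasses import dataclass


@dataclass
class NodeAuth:
    """Hand-collected facts about one HA node (hypothetical structure)."""
    name: str
    auth_server: str             # server selected in User Manager settings
    has_sync_user: bool          # the sync user exists on this node
    sync_user_passes_test: bool  # result of Diagnostics > Authentication


def check_pair(primary, secondary):
    """Return a list of findings; an empty list means nothing stood out."""
    problems = []
    if primary.auth_server != secondary.auth_server:
        problems.append("authentication servers differ between the nodes")
    if not secondary.has_sync_user:
        problems.append("sync user is missing on the secondary")
    for node in (primary, secondary):
        if not node.sync_user_passes_test:
            problems.append(
                f"sync user fails Diagnostics > Authentication on {node.name}"
            )
    return problems


# Example values are made up for illustration.
primary = NodeAuth("primary", "LDAP", True, True)
secondary = NodeAuth("secondary", "LDAP", True, False)
for finding in check_pair(primary, secondary):
    print(finding)
```

Since the secondary is the node that decides where the sync user is authenticated, its entries are the ones to check first.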