Ok guys, long update here. I have been doing lots of testing.
TL;DR: there is a bug in the version of busybox udhcpc (1.19.4) compiled into Unifi hardware that causes this. When the search domain begins with a numeric character, udhcpc barfs on it and sets the search domain to "bad" in /etc/resolv.conf (screenshot below). This causes the device to fail to register with the controller. The bug was fixed a year and a half ago, so I don't know why we are stuck with such an ancient version, but I opened a post over on the UBNT forums and am awaiting a response. For now, the only simple fix that works in all the test cases below is workaround #2 or #3. If you care, you can read more...
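For anyone who wants to see the failure mode in miniature, here is a rough Python model of it. This is not the busybox source. It assumes the old check requires every label to start with a letter and that any failure replaces the whole search domain with the literal string "bad"; the function names and the `allow_leading_digit` switch are made up for illustration.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)


def label_ok(label: str, allow_leading_digit: bool) -> bool:
    """Check one DNS label against the modelled rule (hypothetical helper)."""
    if not label:
        return False
    # Assumption: the strict (buggy) rule only accepts a letter as first character.
    first_allowed = (LETTERS | DIGITS) if allow_leading_digit else LETTERS
    if label[0] not in first_allowed:
        return False
    # Remaining characters: letters, digits or hyphen.
    return all(ch in LETTERS | DIGITS | {"-"} for ch in label[1:])


def sanitize_search_domain(domain: str, allow_leading_digit: bool) -> str:
    """Return the domain if every label passes, otherwise the literal "bad"."""
    labels = domain.split(".")
    if all(label_ok(label, allow_leading_digit) for label in labels):
        return domain
    return "bad"


for d in ("36hudson.lan", "hudson36.lan"):
    strict = sanitize_search_domain(d, allow_leading_digit=False)
    relaxed = sanitize_search_domain(d, allow_leading_digit=True)
    print(f"{d}: strict={strict} relaxed={relaxed}")
```

With the strict rule, only the domain that starts with a digit gets turned into "bad", which matches what I see on the AP.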
There are 3 known workarounds:
1. Don't use a system domain that starts with a number (may or may not be an option for you)
2. Use my patch (for 2.4b use this commit, for 2.3.3 use this commit) and tick the "add unqualified (short) hostname" checkbox on your unifi.whatever host override (PR#3599)
3. In Custom Options of DNS Resolver, add a line, e.g.
local-data: "unifi A 4.5.6.7"
I built up a mini lab with a fresh install of 2.3.3 on APU2 hardware. Bog-standard out-of-the-box config; I was focusing solely on isolating this issue and reproducing it. I believe I have uncovered something strange.
The ingredients for the test were:
• unprovisioned Unifi Access Point - running latest stable firmware, which at the time was 3.7.40.6115*
• pfSense CE 2.3.3 on APU2 hardware - clean install
• 2 interfaces configured - WAN/LAN
• Unifi WAP plugged directly into LAN interface (POE injector)
• DNS handled by Unbound - default config options [Transparent/DNSSEC enabled]
• Single Host Override defined, "unifi.system-domain", pointing to imaginary controller 1.2.3.4
• system-domain was alternated between `36hudson.lan` and `hudson36.lan`
• also tested all of the above again with PR#3599 installed, with and without enabling the "add unqualified hostname" option
Steps:
1. boot Unifi WAP from powered-off state
2. once it's booted, ssh in and run
cat /etc/resolv.conf
ping unifi (if that fails, ping unifi.fqdn)
nslookup unifi (if fail, nslookup unifi.fqdn)
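If you want to check a pile of resolv.conf dumps quickly, something like the Python sketch below works on a workstation (paste in the output of the cat command). The function name is made up, and the sample text is a hypothetical illustration of an affected file: I am assuming the line reads `search bad`, and the nameserver address is invented.

```python
def check_resolv_conf(text: str) -> dict:
    """Pull search domains and nameservers out of resolv.conf-style text."""
    search, nameservers = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if parts[0] in ("search", "domain"):
            search.extend(parts[1:])
        elif parts[0] == "nameserver" and len(parts) > 1:
            nameservers.append(parts[1])
    # "bad" in the search list is the symptom described in this post.
    return {"search": search, "nameservers": nameservers, "affected": "bad" in search}


# Hypothetical sample of what an affected AP might show.
sample_text = """\
search bad
nameserver 192.168.1.1
"""

print(check_resolv_conf(sample_text))
```

An unaffected device should report its real search domain and `affected` as False.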
The test results are below. While it's not a pfSense-specific issue, I believe that my patch handles this problem cleanly, and due to the popularity of Unifi + pfSense, it would be helpful to have it in there. It does fix the issue for me, and until Ubiquiti resolves the matter, at least we have an easy and consistent way to patch any affected systems. I do have some .pcap packet captures if anyone needs those, but honestly, once I found the issue it was pretty easy to reproduce. After inspecting the captures in Wireshark, I don't think this bug is a result of any malformed requests or responses on the wire.


____________
*also tested with 2 older firmwares [3.4.14.3413, 3.7.39.6089] -- same results