container unbound unhealthy

b1pum · Nov 28, 2023

I have the same problem. Reinstalled VM with Debian.
Some Docker containers don’t start because they depend on unbound, and unbound starts as unhealthy.
The ./healthcheck.sh in the container says:
Maybe check your outbound firewall, as it needs to resolve DNS over TCP AND UDP!

nslookup mailcow.email 127.0.0.1 
;; connection timed out; no servers could be reached

./docker-entrypoint.sh 
Setting console permissions...
Receiving anchor key...
Receiving root hints...
############################################################# 100.0%
setup in directory /etc/unbound
removing artifacts
Setup success. Certificates created. Enable in unbound.conf file to use

Giga · Jan 22, 2024

I had the same issue. My firewall logs revealed that mailcow (probably the unbound container) uses ping to check whether the public DNS resolver is reachable (1.1.1.1 in my case). This makes no sense to me. Trying to resolve some names would be a better way to check whether DNS works…

I usually only allow what is really necessary, and I didn’t see any need to allow pings. Now that I’ve allowed them, everything works and the unbound container has switched back to healthy.
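
A DNS-based check of the kind suggested above could look roughly like the sketch below. It assumes nslookup is available where it runs; the function name dns_check is hypothetical, and this is not what the healthcheck.sh shipped with mailcow actually does.

```shell
# Hypothetical helper: succeeds only if the resolver given as $2
# (default 127.0.0.1) answers a query for the name given as $1.
dns_check() {
    name="$1"
    resolver="${2:-127.0.0.1}"
    # nslookup exits non-zero when the query fails or times out
    if nslookup "$name" "$resolver" >/dev/null 2>&1; then
        echo "DNS OK: $name via $resolver"
    else
        echo "DNS FAILED: $name via $resolver"
        return 1
    fi
}
```

For example, dns_check mailcow.email inside the unbound container would query the local unbound instance, while dns_check mailcow.email 1.1.1.1 would test an outside resolver.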

DocFraggle · Jan 22, 2024

Giga, it seems that the ping check (and a new netcat check) was only introduced last week by @DerLinkman in mailcow/mailcow-dockerized, commit b29dc37.
And I agree that an ICMP check ~~as well as an HTTP/HTTPS check~~ for the unbound container doesn’t really make sense… unbound needs to process DNS queries via DNS; afaik it doesn’t ping any server ~~or use HTTP/HTTPS~~

EDIT: OK, HTTPS checks indeed make sense if DNS over HTTPS is used

uniquegch · Feb 2, 2024

My unbound container is permanently unhealthy, but it doesn’t seem to have any impact. Where can I find more details? Running docker compose logs with the container name is not working.

DocFraggle · Feb 2, 2024

uniquegch

docker compose logs unbound-mailcow

uniquegch · Feb 2, 2024

unbound-mailcow-1 | setup in directory /etc/unbound
unbound-mailcow-1 | removing artifacts
unbound-mailcow-1 | Setup success. Certificates created. Enable in unbound.conf file to use
unbound-mailcow-1 | [1706863286] unbound[1:0] notice: init module 0: validator
unbound-mailcow-1 | [1706863286] unbound[1:0] notice: init module 1: iterator
unbound-mailcow-1 | [1706863286] unbound[1:0] info: start of service (unbound 1.17.1).
unbound-mailcow-1 | [1706863293] unbound[1:0] info: generate keytag query _ta-4f66. NULL IN

seems to look ok

DocFraggle · Feb 2, 2024

You can connect to the unbound container and run the healthcheck.sh script to see what it’s complaining about:

docker compose exec unbound-mailcow /bin/bash 

728992fc9ee9:/# ./healthcheck.sh 
PING 1.1.1.1 (1.1.1.1): 56 data bytes

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 5.749/5.795/5.857 ms
PING 8.8.8.8 (8.8.8.8): 56 data bytes

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 5.185/5.252/5.361 ms
PING 9.9.9.9 (9.9.9.9): 56 data bytes

--- 9.9.9.9 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 11.493/11.595/11.658 ms

uniquegch · Feb 2, 2024

Thanks @DocFraggle, I did that and had to unblock another IP address range.
But looking at it again after 1 hour, the container is still unhealthy, even though I restarted it. I will run the healthcheck again over the weekend.

KaiserN · Feb 2, 2024

maybe stop fail2ban and start again with a basic fw ruleset if you have connection problems somehow.
(to investigate)

uniquegch · Feb 2, 2024

well the healthcheck is ok
34bea2964d85:/# ./healthcheck.sh
PING 1.1.1.1 (1.1.1.1): 56 data bytes

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.367/1.429/1.521 ms
PING 8.8.8.8 (8.8.8.8): 56 data bytes

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.452/1.472/1.501 ms
PING 9.9.9.9 (9.9.9.9): 56 data bytes

--- 9.9.9.9 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 26.597/26.694/26.792 ms

DocFraggle · Feb 2, 2024

Have a look at /var/log/healthcheck.log inside the unbound container
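
The log can also be read from the host without opening an interactive shell. A minimal sketch, assuming it is run from the mailcow directory (for example /opt/mailcow-dockerized); the function name unbound_healthcheck_log is made up for illustration.

```shell
# Hypothetical helper: print the last N lines (default 20) of the
# healthcheck log from inside the unbound container.
unbound_healthcheck_log() {
    lines="${1:-20}"
    # -T disables pseudo-TTY allocation, so this also works in scripts
    docker compose exec -T unbound-mailcow tail -n "$lines" /var/log/healthcheck.log
}
```

Calling unbound_healthcheck_log 5 would print the five most recent log lines.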

pozzo-balbi · Feb 4, 2024

Hi, same problem here. I think one cause of confusion is the difference between the healthcheck script inside the container and whether Docker reports the container as “healthy”. This is what I get after docker-compose up -d (PS: I use podman with standalone docker-compose inside a VM behind NAT and NAT6; healthcheck.sh runs without any problems):

 ✘ Container mailcowdockerized-unbound-mailcow-1          Error                                                                                    3.1s 
[..]
dependency failed to start: container mailcowdockerized-unbound-mailcow-1 had unexpected health status ""

If you restart unbound-mailcow manually, postfix will not be running. I played around a bit, and version 2023-05 had no problems whatsoever. Running docker-compose up unbound-mailcow -d ; docker-compose up -d on version 2023-10a yields this:

  Container mailcowdockerized-unbound-mailcow-1    Healthy                                                                                                    0.0s

Without first starting unbound-mailcow in version 2023-10a, I get this:

 ✘ Container mailcowdockerized-unbound-mailcow-1          Error                                                                                                3.0s

So my assumption is that Docker checks at startup whether unbound-mailcow changes from “started” to “healthy” status. If this does not happen within a short time at boot, Docker switches the container to error status, even though you can connect to unbound, healthcheck.sh works fine, and unbound is resolving DNS just fine.
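
To compare what healthcheck.sh prints with what the container engine itself records, the engine's health state can be queried directly. This is only a sketch: the function name unbound_health is hypothetical, it assumes the docker CLI with the compose plugin, and with podman the field layout may differ.

```shell
# Hypothetical helper: print the health status the engine has recorded
# for the unbound-mailcow container ("starting", "healthy", "unhealthy"),
# or "none" if no health state is recorded.
unbound_health() {
    # -a also lists containers that are not currently running
    cid="$(docker compose ps -a -q unbound-mailcow)"
    if [ -z "$cid" ]; then
        echo "unbound-mailcow container not found" >&2
        return 1
    fi
    docker inspect \
        --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' \
        "$cid"
}
```

If this prints "none" or an empty value while healthcheck.sh succeeds when run by hand, that would be consistent with the empty health status in the error message above.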

I switched back to the latest mailcow and, in docker-compose.yml, changed “condition: service_healthy” to “condition: service_started” everywhere I found it. Now it runs without any problems. Unbound says at startup:

  Container mailcowdockerized-unbound-mailcow-1          Started                                                                                              3.0s

I hope this helps to find a cure for this problem. Thanks
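
The edit described above can be applied in one pass with sed. A rough sketch, assuming the condition lines appear literally as “condition: service_healthy”; the function name relax_health_condition is made up, and a backup copy is kept so the change can be reverted.

```shell
# Hypothetical helper: replace every "condition: service_healthy" with
# "condition: service_started" in the given compose file
# (default: docker-compose.yml), keeping the original as <file>.bak.
relax_health_condition() {
    file="${1:-docker-compose.yml}"
    cp "$file" "$file.bak" || return 1
    sed 's/condition: service_healthy/condition: service_started/' \
        "$file.bak" > "$file"
}
```

After running relax_health_condition in the mailcow directory, docker-compose up -d would pick up the changed conditions. Note that this only stops other containers from waiting for unbound to become healthy; it does not fix whatever makes the health state fail.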

tardar · Feb 9, 2024

@pozzo-balbi could you try entering the unbound container with:

docker compose exec unbound-mailcow /bin/bash

and then try to look up any domain, for example with nslookup mailcow.email?
Are you able to resolve it?
Do you have a firewall in front of your mailserver?
If yes, make sure that your mailserver is able to contact the DNS server you use (the one in your firewall or your own, not an external one).

Please tell us more

PS: I fixed same problem because there was a firewall rule missing but i don’t know why that worked before ^^

DocFraggle · Feb 9, 2024

tardar wrote: “PS: I fixed same problem because there was a firewall rule missing but i don’t know why that worked before ^^”

That’s because the ping check in the health script was introduced with 2024-01… it wasn’t there in earlier releases and IMHO isn’t necessary at all. If Google decides to block incoming ICMP requests to their DNS servers one day a lot of Mailcows will go down at once…

pozzo-balbi · Feb 10, 2024

@tardar The healthcheck script is never run, but if I run it manually, it reports no network issues: IPv6, HTTP, HTTPS, ICMP, DNS and the rest are working. Changing the variable in mailcow.conf did not help, but changing “healthy” to “started” in docker-compose.yml fixed the problem.

lipunis · Feb 13, 2024

I have the same problem. I tried what was suggested here, but the error remains.

DocFraggle · Feb 13, 2024

Just to clarify: you checked if the healthcheck.sh script is running without errors from within the unbound container?

cd /opt/mailcow-dockerized
docker compose exec unbound-mailcow /bin/bash

5d9850aa648f:/# ./healthcheck.sh 
PING 1.1.1.1 (1.1.1.1): 56 data bytes

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 5.714/5.755/5.826 ms
PING 8.8.8.8 (8.8.8.8): 56 data bytes

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 5.311/5.421/5.540 ms
PING 9.9.9.9 (9.9.9.9): 56 data bytes

--- 9.9.9.9 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 11.484/11.526/11.585 ms

5d9850aa648f:/# tail /var/log/healthcheck.log 
2024-02-13 09:14:08: Healthcheck: ALL CHECKS WERE SUCCESSFUL! Unbound is healthy!

lipunis · Feb 13, 2024

Yes, 0 packets are lost.

DocFraggle · Feb 13, 2024

lipunis and the file /var/log/healthcheck.log states that everything is OK?

5d9850aa648f:/# tail /var/log/healthcheck.log 
2024-02-13 09:14:08: Healthcheck: ALL CHECKS WERE SUCCESSFUL! Unbound is healthy!