Dear Community,

I’ve been using mailcow on an Ubuntu 22.04 VM, running on a Synology DS920+, for some time now, and I’m usually happy and find my way around (also thanks to this great community)… until I ran into the following problem this morning: the Dovecot container stopped, probably triggered by the watchdog, since all the other containers stopped as well.

The log shows:
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:01.634938320Z 2023-10-06 04:00:01,579 WARN received SIGTERM indicating exit request
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:01.748452680Z 2023-10-06 04:00:01,661 INFO waiting for processes, dovecot, syslog-ng to die
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:01.748516194Z Oct 6 04:00:01 65901a56010a syslog-ng[118]: syslog-ng shutting down; version='3.28.1'
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:01.916570784Z 2023-10-06 04:00:01,892 INFO stopped: syslog-ng (exit status 0)
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:03.220295971Z 2023-10-06 04:00:03,219 WARN received SIGQUIT indicating exit request
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:03.332303127Z 2023-10-06 04:00:03,330 INFO stopped: dovecot (exit status 0)
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:03.332442180Z 2023-10-06 04:00:03,331 INFO reaped unknown pid 127 (exit status 0)
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T02:00:03.380773089Z 2023-10-06 04:00:03,373 INFO stopped: processes (terminated by SIGTERM)

The container then remained stopped until I started it manually this morning without issues; it has been working fine since then:
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:37.082443505Z Uptime: 14160 Threads: 13 Questions: 19609 Slow queries: 0 Opens: 63 Open tables: 54 Queries per second avg: 1.384
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:38.563639985Z The user `vmail' is already a member of `tty'.
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:39.120033501Z % Total % Received % Xferd Average Speed Time Time Time Current
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:39.121424397Z Dload Upload Total Spent Left Speed
100 112k 100 112k 0 0 513k 0 --:--:-- --:--:-- --:--:-- 513k
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:39.369412501Z 20_blatspammer.cf
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:39.369582748Z 70_HS_body.cf
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:39.372495628Z 70_HS_header.cf
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:40.037083765Z 2023-10-06 07:56:40,036 INFO Set uid to user 0 succeeded
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:40.047497739Z 2023-10-06 07:56:40,045 INFO supervisord started with pid 1
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.062878900Z 2023-10-06 07:56:41,049 INFO spawned: 'processes' with pid 117
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.062940148Z 2023-10-06 07:56:41,054 INFO spawned: 'dovecot' with pid 118
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.071614843Z 2023-10-06 07:56:41,068 INFO spawned: 'syslog-ng' with pid 119
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.243220478Z [2023-10-06T07:56:41.242412] WARNING: With use-dns(no), dns-cache() will be forced to 'no' too!;
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.247488372Z Oct 6 07:56:41 65901a56010a syslog-ng[119]: syslog-ng starting up; version='3.28.1'
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:41.812114744Z Oct 6 07:56:41 65901a56010a dovecot: doveadm(ysup0v5rwpnropyg@mailcow.local): Error: User doesn't exist
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:42.814420462Z 2023-10-06 07:56:42,813 INFO success: processes entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:42.814466043Z 2023-10-06 07:56:42,814 INFO success: dovecot entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
mailcowdockerized-dovecot-mailcow-1 | 2023-10-06T05:56:42.814479736Z 2023-10-06 07:56:42,814 INFO success: syslog-ng entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

When I read through this morning’s logs, I see some Postfix issues, which seemingly resolved themselves, since the Postfix container was up and running.
I attached the log from today, 04:00 to 05:00 local time.

The only thing that happened around this time was the nightly renewal of the external IPv4/IPv6 addresses of my DSL connection.

It’s not too big of a deal to restart the container… if I’m at home (which often is not the case). So I would like to understand why the container didn’t come back up, i.e. where / in which log files I can find the root cause. Or how to avoid this issue altogether: can I safely disable the watchdog in the config file and run any system updates manually?
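For reference, what I mean by restarting the container and checking its logs is roughly the following (a sketch; the container names are taken from the log above, and the time range is just an example around the incident):

# start the stopped Dovecot container again
docker start mailcowdockerized-dovecot-mailcow-1

# watchdog and Dovecot logs for the incident window
docker logs --since "2023-10-06T01:30:00" --until "2023-10-06T05:00:00" mailcowdockerized-watchdog-mailcow-1
docker logs --since "2023-10-06T01:30:00" --until "2023-10-06T05:00:00" mailcowdockerized-dovecot-mailcow-1

# the Docker daemon log on the VM may also show why the container was stopped and not restarted
journalctl -u docker --since "2023-10-06 01:30" --until "2023-10-06 05:00"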

Many thanks for your support and best regards,
Stefan

Attachment: log-231006-02.txt (1 MB)
    mailcowdockerized-watchdog-mailcow-1 | 2023-10-06T02:18:30.016767651Z CRITICAL - Socket timeout

    Find out why the host is losing network connectivity. You have a lot of these in your watchdog log; I don’t have a single one in several weeks.

    Nullmeridian: The only thing that happened around this time was the nightly renewal of the external IPv4/IPv6 addresses of my DSL connection.

    What? You are using a dynamic IP address? For a mail server??

    Hi, yes I do; I update the IP address via an update script using “qmcgaw/ddns-updater”. No issues so far. For sending, I’m using a mail relay address from my provider. Works like a charm.
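    In case it helps, the updater runs as its own small container, roughly like this (a sketch from memory; the data path and web UI port are the image’s defaults as far as I recall, and the host directory name is just an example, so please check the qmcgaw/ddns-updater documentation):

    # example invocation of the updater container
    docker run -d --name ddns-updater \
      -v "$(pwd)/ddns-data:/updater/data" \
      -p 8000:8000/tcp \
      qmcgaw/ddns-updater
    # provider credentials and the records to update go into ddns-data/config.json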
    Any idea why the Dovecot container didn’t come back? Or at least where I could look?

    As there seem to be no errors in Dovecot, I guess something else is happening on that server, so that the watchdog thinks Dovecot is unresponsive and tries a clean shutdown. Maybe you are running a backup at that time? Or other things eating CPU or causing high I/O load?
    Check the watchdog logs.
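    Something like this would show whether CPU or I/O spikes around that time (a sketch; iostat and sar are part of the sysstat package and may need to be installed first, and sar only has history if sysstat’s data collection is enabled):

    # live per-container resource usage
    docker stats --no-stream
    # disk utilisation, sampled every 5 seconds, 3 samples
    iostat -x 5 3
    # load-average history collected by sysstat for today
    sar -q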

    I checked all logs (I attached the complete log starting at 4 am in my first post). I can’t find any obvious error message… I would have expected some logging from Dovecot or from the watchdog.

    Re memory: I assigned 7 GB, which is seemingly enough for a couple of mailboxes in 3 domains.
    (MiB Mem : 6929.5 total, 642.6 free, 3795.5 used, 2491.5 buff/cache)

    No other tasks running (my backup tasks start at 5am).

    OK, it seems I’m the only one with this issue, and to be honest, it has occurred only once since I switched from the Synology proprietary solution to mailcow some years ago, so let’s close this thread… thanks for your reply.

    Have a good evening,
    Stefan

    mailcowdockerized-watchdog-mailcow-1 | 2023-10-06T02:18:30.016767651Z CRITICAL - Socket timeout

    Find out why the host is losing network connectivity. You have a lot of these in your watchdog log; I don’t have a single one in several weeks.

    Hi, I checked both the Synology (the host of my VM) and the VM itself and couldn’t find any network-related issues. Also, why would only Dovecot be affected and none of the other containers, which all reported health = 100%?
    The Syno is directly attached to the switch/router, and both run stably.
    I disabled the watchdog (USE_WATCHDOG=n) and for the moment all is working fine, and it will probably keep working fine for the upcoming years :-) Thanks again for this great solution!
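    For anyone finding this thread later, the change boils down to the following (a sketch; docker compose needs to be run from the mailcow-dockerized directory so it picks up the updated mailcow.conf):

    # in mailcow.conf
    USE_WATCHDOG=n
    # then recreate the affected containers so the new setting is applied
    docker compose up -d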
