boppy

  • Jan 12, 2024
  • Joined Nov 11, 2023
  • 1 discussion
  • 6 posts
  • 0 best answers
  • 💬 🇩🇪 🇬🇧 | Loony but lucky! | Developer with love and joy.

    • boppy

        Moolevel 1

      That is also one of the checks in the script that I don't understand or can't follow. I'm currently doing an extensive rework of the backup and restore script, but I'll need at least one more weekend before it's in a presentable state.

      In this case I recommend simply disabling the check. It's enough to comment out one line in the file backup_and_restore.sh: mailcow/mailcow-dockerized/blob/master/helper-scripts/backup_and_restore.sh#L47 - just add a # in the first column. The error message will still be printed, but it no longer aborts the script.
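      If you'd rather not edit the file by hand, a sed one-liner can add the #. The demo below runs on a throwaway stand-in file; in the real repo the target is helper-scripts/backup_and_restore.sh, and the line number may have shifted in your checkout, so verify it first.

      ```shell
      # Demo on a throwaway file: prefix line 2 with '#' to disable it.
      # For the real script the equivalent would be:
      #   sed -i '47s/^/#/' helper-scripts/backup_and_restore.sh
      # (check that line 47 is still the check in your checkout first!)
      printf 'echo "checking..."\nexit 1\n' > /tmp/bnr_demo.sh
      sed -i '2s/^/#/' /tmp/bnr_demo.sh
      cat /tmp/bnr_demo.sh
      ```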

    • Okay, that discussion didn’t quite work out. 😭

      As there did not seem to be much interest in the b’n’r script, I contacted the server cow team about my findings and ideas. I’m now in the process of implementing a new backup and restore script. Further discussion (🫣) will take place in a GitHub issue I’ll open once there is a version available to look at.

    • boohoomoo

      I do have hourly backups of the entire VM running Mailcow. So the mails should still be “somewhere”. My other question, therefore, is where can I find the actual mails (the “maildir”?) in my VM in Docker so I know where to look in the backup?

      The volume your mails reside in is (in the default config) mail-vmail-vol-1. If you run a default-configured Docker on Debian (or Ubuntu), the volume should be in /var/lib/docker/volumes/mail-vmail-vol-1/_data.

      You can find the actions performed in a mailbox by checking the logs of mail-dovecot-mailcow-1 (inside the mailcow folder: docker-compose logs dovecot-mailcow). You’ll see lines like these:

      Delete a message

      mail-dovecot-mailcow-1  | Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: copy from INBOX: box=Trash, uid=2244, msgid=<EG.1183.190739.93803.9236d320@example.com>, size=367553
      mail-dovecot-mailcow-1  | Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: delete: box=INBOX, uid=13708, msgid=<EG.1183.190739.93803.9236d320@example.com>, size=367553
      # ------------------------------------------------------------------| IMAP USERNAME     |---------------------------------------------------------------| MESSAGE ID                              |---------------------

      Move a message to a folder

      mail-dovecot-mailcow-1  | Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(henning@example.com)<70016><kUQcI/gKxumsFgH4>: copy from INBOX: box=INBOX/Test-Folder, uid=1, msgid=<EG.1224.190740.93803.d25fb00e@example.com>, size=259925
      # ------------------------------------------------------------------| IMAP USERNAME     |------------------------------------| SRC |-----| TARGET         |--------------| MESSAGE ID                              |-------------
      mail-dovecot-mailcow-1  | Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(henning@example.com)<70016><kUQcI/gKxumsFgH4>: expunge: box=INBOX, uid=13707, msgid=<EG.1224.190740.93803.d25fb00e@example.com>, size=259925

      Remove a folder

      mail-dovecot-mailcow-1  | Nov 25 12:12:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<70067><5aopJvgK+sysFgH4>: Mailbox renamed: INBOX/Test-Folder -> Trash/Test-Folder
      # ------------------------------------------------------------------| IMAP USERNAME     |-------------------------------------------| SRC             |--| TARGET          |

      To limit the output, the docker-compose logs command also accepts arguments:

            --since string    Show logs since timestamp (e.g. 2013-01-02T13:23:37Z) or relative (e.g. 42m for 42 minutes)
            --until string    Show logs before a timestamp (e.g. 2013-01-02T13:23:37Z) or relative (e.g. 42m for 42 minutes)

      So you can review just the timeframe in which the action should have happened:

      docker-compose logs --since '2023-11-01T00:00:00Z' --until '2023-11-14T23:59:59Z' dovecot-mailcow

      The limitation here is that the logs are not stored outside Docker (as far as I can see), so you lose your logs if you update mailcow and it rebuilds the dovecot container.
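      To pull the events for a single mailbox out of that stream, a plain grep over the log output works. The demo below runs on captured sample lines (reusing the example address from above); in practice you would pipe docker-compose logs dovecot-mailcow into the same grep.

      ```shell
      # Demo: filter delete/expunge events for one user from captured dovecot log lines.
      # Real usage:
      #   docker-compose logs dovecot-mailcow | grep -E 'imap\(henning@example\.com\).*(delete|expunge):'
      cat > /tmp/dovecot_sample.log <<'EOF'
      Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: delete: box=INBOX, uid=13708
      Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(other@example.com)<70016><kUQcI/gKxumsFgH4>: expunge: box=INBOX, uid=13707
      EOF
      grep -E 'imap\(henning@example\.com\).*(delete|expunge):' /tmp/dovecot_sample.log
      ```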

    • Not that it actually helps much, but I am using the built-in ACME container and do not have the same problem:

      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Using existing domain rsa key /var/lib/acme/acme/key.pem
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Using existing Lets Encrypt account key /var/lib/acme/acme/account.pem
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Detecting IP addresses...
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - OK: 65.108.x.y, 2a01:4f9:x:y::1
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Validated CAA for parent domain example.com
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Found AAAA record for mail.example.com: 2a01:4f9:x:y::1 - skipping A record check
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Confirmed AAAA record with IP 2a01:04f9:x:y:0000:0000:0000:0001
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Certificate /var/lib/acme/mail.example.com/cert.pem validation done, neither changed nor due for renewal.
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Certificates were successfully validated, no changes or renewals required, sleeping for another day.

      But: the log output on my instance does not once show a domain after the timestamp. I assume I’m on an old version of the acme container. My installation is up to date, but the first installation took place in Aug. 2022 - since my backup-and-restore script also differs from the current one in git, there might also be a difference in the acme container…

      • GitHub and IPv6 are indeed a pain in the bottom… If you have any infrastructure that talks both, you can build yourself a proxy using nginx.

        Config (in a stream {} context) would be as simple as

        server {
            listen [::]:2000; # only allow ipv6
            proxy_pass 140.82.121.4:22; # forward to ipv4 of github.com
        }

        Then add a global URI replacement for git:

        git config --global url."ssh://git@githubv6.example.com:2000/".insteadOf "git@github.com:"

        (where githubv6.example.com has an AAAA record pointing to your IPv6 address…)

        I can confirm that this works. You can also replace the domain name in the copied clone string before each clone…
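        To check what the rewrite does without touching your global config, you can write it to a throwaway config file first (githubv6.example.com and port 2000 are the placeholder values from above):

        ```shell
        # Demo: set and inspect the insteadOf rewrite in a throwaway config file
        rm -f /tmp/gitconfig_demo
        git config -f /tmp/gitconfig_demo \
          url."ssh://git@githubv6.example.com:2000/".insteadOf "git@github.com:"
        # list the resulting mapping (key names are lowercased by git)
        git config -f /tmp/gitconfig_demo --get-regexp insteadof
        ```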

      • Good moo-ning everybody! 🐮

        I feel like there are some issues with the backup and restore script of mailcow. As I cannot find how to start a discussion on GitHub, I’ll post it here ^^

        d-c vs d c

        After going from docker-compose to docker compose, some of the internal naming templates changed (a personal contender for the TOP 10 ideas of the year 😶‍🌫️). This change is only partially mirrored in the script.

        The queries done with docker volume ls (see 1, and >= 12 other places) only scan for docker compose volumes (underscore: <project>_<name>) and not docker-compose (dash: <project>-<name>) ones. There should be a switch. Running instances seem to be fine (i.e. mine), but starting from a fresh clone will break things.
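        Such a switch could be as simple as matching both separators in the name filter. The demo below checks only the pattern logic with grep; the real lookup would be docker volume ls -qf name=^${CMPS_PRJ}[-_]vmail-vol-1$ - assuming Docker's name filter accepts the character class the same way it accepts the script's existing ^/$ anchors.

        ```shell
        # Demo: one pattern that accepts both naming schemes
        # (underscore from `docker compose`, dash from legacy `docker-compose`)
        printf '%s\n' mailcowdockerized_vmail-vol-1 mailcowdockerized-vmail-vol-1 \
          | grep -cE '^mailcowdockerized[-_]vmail-vol-1$'   # prints 2
        ```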

        Side-quest: Documentation

        The docs (docs.mailcow.email) list volumes in the docker-compose way. Also the statement “take good care of these volumes” is weird to me, because clamd-db-vol-1, sogo-userdata-backup-vol-1, sogo-web-vol-1, and vmail-index-vol-1 are not included in the backup script at all (mysql-socket-vol-1 also isn’t, but that makes sense).

        Leaving the side-quest: I feel like the volumes sogo-userdata-backup-vol-1 and sogo-web-vol-1 should be included in the backup and restore, because they contain user config (sogo-userdata-backup-vol-1) and possible changes to SOGo (sogo-web-vol-1). For me, the ClamAV DB and the mail index files can be left out of the b-n-r script.

        Network on backup

        mailcow-backup joins the Docker default network and not the compose network (exception: see 2). Depending on the config of the host, this might even be considered a security problem (I only realized it because I monitor the default net, since I don’t run anything on it). Except for 2, I think no network is needed, so --network none might be a good addition to the docker run commands (if kept as-is).

        Possible race condition on restore

        I cannot fully confirm this since I don’t know the internals of mailcow too well, but the restore process always stops only the container it is currently working on. I feel like a race condition can occur when the first services are up again while other services are still receiving their data. My primary concern is the DB host, which is the last one to be restored (when using all), while postfix and dovecot are already up again at that time.

        Code quality and optimizations

        There are some issues, including splitting issues with variables that are used as directory names, comparing lowercase’d strings against uppercase chars, a 1-item loop, indentation issues, and not using stderr for error output.

        Also, I think it would be an option to move much of the process into the backup container with a little helper script. That could help with the race condition mentioned above, but also make it easier to streamline the process. I imagine a helper script that, for the backup part, takes the same arguments as the current script but optimizes the backup, like

        CMPS_SPLIT="_"
        function backup_docker() {
            docker run --name mailcow-backup --rm \
                --network $(docker network ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}mailcow-network$) \
                -v ${BACKUP_LOCATION}/mailcow-${DATE}:/backup:z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}vmail-vol-1$):/vmail:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}crypt-vol-1$):/crypt:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}redis-vol-1$):/redis:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}rspamd-vol-1$):/rspamd:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}postfix-vol-1$):/postfix:ro,z \
                ${DEBIAN_DOCKER_IMAGE} /bin/run-backup.sh $1
        }
        
        function backup() {
        
          RUN_BACKUPS=()
          while (( "$#" )); do
            # ;;& keeps testing the remaining patterns, so "all" matches every branch
            case "$1" in
            vmail|all)
              RUN_BACKUPS+=( "vmail" )
              ;;&
            crypt|all)
              RUN_BACKUPS+=( "crypt" )
              ;;&
            redis|all)
              RUN_BACKUPS+=( "redis" )
              ;;&
            rspamd|all)
              RUN_BACKUPS+=( "rspamd" )
              ;;&
            postfix|all)
              RUN_BACKUPS+=( "postfix" )
              ;;&
            esac
            shift
          done
        
          if [[ "${#RUN_BACKUPS[@]}" -gt 0 ]]; then
            backup_docker "${RUN_BACKUPS[@]}"
          fi
        }

        As mentioned, the restore process is always “just” working on one job. Since pigz is not really good at multithreaded decompression, multiple files could be decompressed in parallel to reduce the runtime.
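        A sketch of that idea with xargs -P; the demo uses gzip on throwaway files, but swapping in pigz -d and the real archive paths is a one-line change.

        ```shell
        # Demo: decompress up to 3 files in parallel with xargs -P
        mkdir -p /tmp/bnr_par && cd /tmp/bnr_par
        for f in vmail crypt redis; do echo "data-$f" > "$f"; gzip -f "$f"; done
        printf '%s\n' *.gz | xargs -P 3 -n 1 gzip -d   # gzip -d drops the .gz suffix
        ls /tmp/bnr_par
        ```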

        MySQL Backup

        The DB backup part scans for mysql and mariadb in the compose file, but uses mariabackup for every operation on the found container. This tool is not available in mysql images - I checked mysql:8 - and I assume mysqldump would be a good option to go with.
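        The tool choice could key off the image name found in the compose file. A minimal sketch of that (pure string logic; the real script would docker exec the chosen tool inside the container, and the image tags here are only illustrative):

        ```shell
        # Pick a dump tool based on the image name; mariabackup exists only in
        # MariaDB images, so mysql images fall back to mysqldump.
        pick_dump_tool() {
          case "$1" in
            *mariadb*) echo mariabackup ;;
            *mysql*)   echo mysqldump ;;
            *)         echo unsupported ;;
          esac
        }
        pick_dump_tool mariadb:10.11   # prints mariabackup
        pick_dump_tool mysql:8         # prints mysqldump
        ```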

        Also, I see that a --prepare is done after each backup (see 4). Does it make sense AFTER a backup but not BEFORE a restore? The MariaDB docs are vague on that (“If you try to restore the database without first preparing the data, InnoDB rejects the new data as corrupt.”), but I feel that the backup is prepared using the available DB as reference. I could totally be wrong here (usually I am 🤷‍♂️), since I haven’t used MariaDB for more than 8 years now… Also, this preparation, the owner change, and the compression are done even if the backup process fails.

        The delete-days logic’s find can also be optimized to skip the date calculation (by using -mtime) and to use the -delete option, which reduces the risk of word splitting. That could result in find "${BACKUP_LOCATION}/mailcow-"* -maxdepth 0 -mtime +${1} -delete - but -delete can only handle files and empty dirs, so the DB dumps of older backups would not be removed; an -exec rm might still be needed for 100% backwards compatibility. The DB backup is also the only one not using pigz for compression.
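        A demo of the -delete variant on throwaway directories (paths and the 7-day threshold are made up; note -delete silently skips non-empty dirs, which is exactly the DB-dump caveat above):

        ```shell
        # Demo: prune empty backup dirs older than 7 days with -delete
        mkdir -p /tmp/bnr_prune/mailcow-old /tmp/bnr_prune/mailcow-new
        touch -d '10 days ago' /tmp/bnr_prune/mailcow-old   # backdate (GNU touch)
        find /tmp/bnr_prune/mailcow-* -maxdepth 0 -mtime +7 -delete
        ls /tmp/bnr_prune   # only mailcow-new remains
        ```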

        Conclusion

        I would offer to refactor/rewrite the backup and restore script after agreeing with you moo’ists that all (or at least some, please 🤓) of my findings are correct. Because the script would go through a complete rewrite, I am hesitant to provide a PR before discussion…

        And sorry for the long post. 🤷‍♂️