boppy

  • Jan 12, 2024
  • Joined Nov 11, 2023
  • 1 discussion
  • 6 posts
  • 0 best answers
  • 💬 🇩🇪 🇬🇧 | Loony but lucky! | Developer with love and joy.

    • boppy

        Moolevel 1

      That is also one of the checks in the script that I don't understand or can't follow. I'm currently doing an extensive rework of the backup and restore script, but I'll need at least one more weekend before it's in a presentable state.

      In this case I recommend simply disabling the check. It's enough to comment out one line in the file backup_and_restore.sh: mailcow/mailcow-dockerized/blob/master/helper-scripts/backup_and_restore.sh#L47 - just add a # in the first column. The error message will still be printed, but it no longer aborts the script.
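      If you'd rather not edit the file by hand, a sed one-liner can add the #. The demo below runs on a throwaway stand-in file; in the real repo the target is helper-scripts/backup_and_restore.sh, and the line number may have shifted in your checkout, so verify it first.

      ```shell
      # Demo on a throwaway file: prefix line 2 with '#' to disable it.
      # For the real script the equivalent would be:
      #   sed -i '47s/^/#/' helper-scripts/backup_and_restore.sh
      # (check that line 47 is still the check in your checkout first!)
      printf 'echo "checking..."\nexit 1\n' > /tmp/bnr_demo.sh
      sed -i '2s/^/#/' /tmp/bnr_demo.sh
      cat /tmp/bnr_demo.sh
      ```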

    • Okay, that discussion didn’t quite work out. 😭

      As there did not seem to be much interest in the b’n’r script, I contacted the server cow team about my findings and ideas. I’m now in the process of implementing a new backup and restore script. Further discussion (🫣) will take place in a GitHub issue I’ll open once there is a version available to look at.

    • boohoomoo

      I do have hourly backups of the entire VM running Mailcow. So the mails should still be “somewhere”. My other question, therefore, is where can I find the actual mails (the “maildir”?) in my VM in Docker so I know where to look in the backup?

      The volume your mails reside in is (in the default config) mail-vmail-vol-1. If you run a default-configured Docker on Debian (or Ubuntu), the volume should be in /var/lib/docker/volumes/mail-vmail-vol-1/_data.

      You can find the actions performed in a mailbox by checking the logs of mail-dovecot-mailcow-1 (inside the mailcow folder: docker-compose logs dovecot-mailcow). You’ll see lines like these:

      Delete a message

      mail-dovecot-mailcow-1  | Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: copy from INBOX: box=Trash, uid=2244, msgid=<EG.1183.190739.93803.9236d320@example.com>, size=367553
      mail-dovecot-mailcow-1  | Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: delete: box=INBOX, uid=13708, msgid=<EG.1183.190739.93803.9236d320@example.com>, size=367553
      # ------------------------------------------------------------------| IMAP USERNAME     |---------------------------------------------------------------| MESSAGE ID                              |---------------------

      Move a message to a folder

      mail-dovecot-mailcow-1  | Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(henning@example.com)<70016><kUQcI/gKxumsFgH4>: copy from INBOX: box=INBOX/Test-Folder, uid=1, msgid=<EG.1224.190740.93803.d25fb00e@example.com>, size=259925
      # ------------------------------------------------------------------| IMAP USERNAME     |------------------------------------| SRC |-----| TARGET         |--------------| MESSAGE ID                              |-------------
      mail-dovecot-mailcow-1  | Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(henning@example.com)<70016><kUQcI/gKxumsFgH4>: expunge: box=INBOX, uid=13707, msgid=<EG.1224.190740.93803.d25fb00e@example.com>, size=259925

      Remove a folder

      mail-dovecot-mailcow-1  | Nov 25 12:12:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<70067><5aopJvgK+sysFgH4>: Mailbox renamed: INBOX/Test-Folder -> Trash/Test-Folder
      # ------------------------------------------------------------------| IMAP USERNAME     |-------------------------------------------| SRC             |--| TARGET          |

      To limit the output, the docker-compose logs command also accepts arguments:

            --since string    Show logs since timestamp (e.g. 2013-01-02T13:23:37Z) or relative (e.g. 42m for 42 minutes)
            --until string    Show logs before a timestamp (e.g. 2013-01-02T13:23:37Z) or relative (e.g. 42m for 42 minutes)

      So you can review just the timeframe in which the action should have happened:

      docker-compose logs --since '2023-11-01T00:00:00Z' --until '2023-11-14T23:59:59Z' dovecot-mailcow

      The limitation here is that the logs are not stored outside Docker (as far as I can see), so you lose your logs if you update mailcow and it rebuilds the dovecot container.
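      To pull the events for a single mailbox out of that stream, a plain grep over the log output works. The demo below runs on captured sample lines (reusing the example address from above); in practice you would pipe docker-compose logs dovecot-mailcow into the same grep.

      ```shell
      # Demo: filter delete/expunge events for one user from captured dovecot log lines.
      # Real usage:
      #   docker-compose logs dovecot-mailcow | grep -E 'imap\(henning@example\.com\).*(delete|expunge):'
      cat > /tmp/dovecot_sample.log <<'EOF'
      Nov 25 12:09:48 c4e857f4d6b7 dovecot: imap(henning@example.com)<69859><h/ZmG/gKHqasFgH4>: delete: box=INBOX, uid=13708
      Nov 25 12:11:57 c4e857f4d6b7 dovecot: imap(other@example.com)<70016><kUQcI/gKxumsFgH4>: expunge: box=INBOX, uid=13707
      EOF
      grep -E 'imap\(henning@example\.com\).*(delete|expunge):' /tmp/dovecot_sample.log
      ```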

    • Not that it actually helps much, but I am using the built-in ACME container and do not have the same problem:

      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Using existing domain rsa key /var/lib/acme/acme/key.pem
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Using existing Lets Encrypt account key /var/lib/acme/acme/account.pem
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - Detecting IP addresses...
      mail-acme-mailcow-1  | Thu Nov 16 15:08:49 CET 2023 - OK: 65.108.x.y, 2a01:4f9:x:y::1
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Validated CAA for parent domain example.com
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Found AAAA record for mail.example.com: 2a01:4f9:x:y::1 - skipping A record check
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Confirmed AAAA record with IP 2a01:04f9:x:y:0000:0000:0000:0001
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Certificate /var/lib/acme/mail.example.com/cert.pem validation done, neither changed nor due for renewal.
      mail-acme-mailcow-1  | Thu Nov 16 15:08:50 CET 2023 - Certificates were successfully validated, no changes or renewals required, sleeping for another day.

      But: the log output on my instance does not once show a domain after the timestamp. I assume I’m on an old version of the acme container. My installation is up to date, but the first installation took place in Aug. 2022 - since my backup-and-restore script also differs from the current one in git, there might also be a difference in the acme container…

      • GitHub and IPv6 are indeed a pain in the bottom… If you have any infrastructure that talks both, you can build yourself a proxy using nginx.

        Config (in a stream {} context) would be as simple as

        server {
            listen [::]:2000; # only allow ipv6
            proxy_pass 140.82.121.4:22; # forward to ipv4 of github.com
        }

        Then add a global URI replacement for git:

        git config --global url."ssh://git@githubv6.example.com:2000/".insteadOf "git@github.com:"

        (where githubv6.example.com has an AAAA record pointing to your IPv6 address…)

        I can confirm that this works. You can also replace the domain name in the copied clone string before each clone…
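        To check what the rewrite does without touching your global config, you can write it to a throwaway config file first (githubv6.example.com and port 2000 are the placeholder values from above):

        ```shell
        # Demo: set and inspect the insteadOf rewrite in a throwaway config file
        rm -f /tmp/gitconfig_demo
        git config -f /tmp/gitconfig_demo \
          url."ssh://git@githubv6.example.com:2000/".insteadOf "git@github.com:"
        # list the resulting mapping (key names are lowercased by git)
        git config -f /tmp/gitconfig_demo --get-regexp insteadof
        ```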

      • Good moo-ning everybody! 🐮

        I feel like there are some issues with the backup and restore script of mailcow. As I cannot find how to start a discussion on GitHub, I’ll post it here ^^

        d-c vs d c

        After going from docker-compose to docker compose, some of the internal naming templates changed (a personal contender for the TOP 10 ideas of the year 😶‍🌫️). This change is only partially mirrored in the script.

        The queries done with docker volume ls (see 1, and >= 12 other places) only scan for docker compose volumes (underscore: <project>_<name>) and not docker-compose (dash: <project>-<name>) ones. There should be a switch. Running instances seem to be fine (i.e. mine), but starting from a fresh clone will break things.
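        Such a switch could be as simple as matching both separators in the name filter. The demo below checks only the pattern logic with grep; the real lookup would be docker volume ls -qf name=^${CMPS_PRJ}[-_]vmail-vol-1$ - assuming Docker's name filter accepts the character class the same way it accepts the script's existing ^/$ anchors.

        ```shell
        # Demo: one pattern that accepts both naming schemes
        # (underscore from `docker compose`, dash from legacy `docker-compose`)
        printf '%s\n' mailcowdockerized_vmail-vol-1 mailcowdockerized-vmail-vol-1 \
          | grep -cE '^mailcowdockerized[-_]vmail-vol-1$'   # prints 2
        ```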

        Side-quest: Documentation

        The docs (docs.mailcow.email) list volumes in the docker-compose way. Also the statement “take good care of these volumes” is weird to me, because clamd-db-vol-1, sogo-userdata-backup-vol-1, sogo-web-vol-1, and vmail-index-vol-1 are not included in the backup script at all (mysql-socket-vol-1 also isn’t, but that makes sense).

        Leaving the side-quest: I feel like the volumes sogo-userdata-backup-vol-1 and sogo-web-vol-1 should be included in the backup and restore, because they contain user config (sogo-userdata-backup-vol-1) and possible changes to SOGo (sogo-web-vol-1). For me, the ClamAV DB and the mail index files can be left out of the b-n-r script.

        Network on backup

        mailcow-backup joins the Docker default network and not the compose network (exception: see 2). Depending on the config of the host, this might even be considered a security problem (I only realized it because I monitor the default net, since I don’t run anything on it). Except for 2, I think no network is needed, so --network none might be a good addition to the docker run commands (if kept as-is).

        Possible race condition on restore

        I cannot fully confirm this since I don’t know the internals of mailcow too well, but the restore process always stops only the container it is currently working on. I feel like a race condition can occur when the first services are up again while other services are still receiving their data. My primary concern is the DB host, which is the last one to be restored (when using all), while postfix and dovecot are already up again at that time.

        Code quality and optimizations

        There are some issues, including splitting issues with variables that are used as directory names, comparing lowercase’d strings against uppercase chars, a 1-item loop, indentation issues, and not using stderr for error output.

        Also, I think it would be an option to move much of the process into the backup container with a little helper script. That could help with the race condition mentioned above, but also make it easier to streamline the process. I imagine a helper script that, for the backup part, takes the same arguments as the current script but optimizes the backup, like

        CMPS_SPLIT="_"
        function backup_docker() {
            docker run --name mailcow-backup --rm \
                --network $(docker network ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}mailcow-network$) \
                -v ${BACKUP_LOCATION}/mailcow-${DATE}:/backup:z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}vmail-vol-1$):/vmail:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}crypt-vol-1$):/crypt:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}redis-vol-1$):/redis:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}rspamd-vol-1$):/rspamd:ro,z \
                -v $(docker volume ls -qf name=^${CMPS_PRJ}${CMPS_SPLIT}postfix-vol-1$):/postfix:ro,z \
                ${DEBIAN_DOCKER_IMAGE} /bin/run-backup.sh $1
        }
        
        function backup() {
        
          RUN_BACKUPS=()
          while (( "$#" )); do
            # ;;& keeps testing the remaining patterns, so "all" matches every branch
            case "$1" in
            vmail|all)
              RUN_BACKUPS+=( "vmail" )
              ;;&
            crypt|all)
              RUN_BACKUPS+=( "crypt" )
              ;;&
            redis|all)
              RUN_BACKUPS+=( "redis" )
              ;;&
            rspamd|all)
              RUN_BACKUPS+=( "rspamd" )
              ;;&
            postfix|all)
              RUN_BACKUPS+=( "postfix" )
              ;;&
            esac
            shift
          done
        
          if [[ "${#RUN_BACKUPS[@]}" -gt 0 ]]; then
            backup_docker "${RUN_BACKUPS[@]}"
          fi
        }

        As mentioned, the restore process is always “just” working on one job. Since pigz is not really good at multithreaded decompression, multiple files could be decompressed in parallel to reduce the runtime.
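        A sketch of that idea with xargs -P; the demo uses gzip on throwaway files, but swapping in pigz -d and the real archive paths is a one-line change.

        ```shell
        # Demo: decompress up to 3 files in parallel with xargs -P
        mkdir -p /tmp/bnr_par && cd /tmp/bnr_par
        for f in vmail crypt redis; do echo "data-$f" > "$f"; gzip -f "$f"; done
        printf '%s\n' *.gz | xargs -P 3 -n 1 gzip -d   # gzip -d drops the .gz suffix
        ls /tmp/bnr_par
        ```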

        MySQL Backup

        The DB backup part scans for mysql and mariadb in the compose file, but uses mariabackup for every operation on the found container. This tool is not available in mysql images - I checked mysql:8 - and I assume mysqldump would be a good option to go with.
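        The tool choice could key off the image name found in the compose file. A minimal sketch of that (pure string logic; the real script would docker exec the chosen tool inside the container, and the image tags here are only illustrative):

        ```shell
        # Pick a dump tool based on the image name; mariabackup exists only in
        # MariaDB images, so mysql images fall back to mysqldump.
        pick_dump_tool() {
          case "$1" in
            *mariadb*) echo mariabackup ;;
            *mysql*)   echo mysqldump ;;
            *)         echo unsupported ;;
          esac
        }
        pick_dump_tool mariadb:10.11   # prints mariabackup
        pick_dump_tool mysql:8         # prints mysqldump
        ```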

        Also, I see that a --prepare is done after each backup (see 4). Does it make sense AFTER a backup but not BEFORE a restore? The MariaDB docs are vague on that (“If you try to restore the database without first preparing the data, InnoDB rejects the new data as corrupt.”), but I feel that the backup is prepared using the available DB as reference. I could totally be wrong here (usually I am 🤷‍♂️), since I haven’t used MariaDB for more than 8 years now… Also, this preparation, the owner change, and the compression are done even if the backup process fails.

        The delete-days logic’s find can also be optimized to skip the date calculation (by using -mtime) and to use the -delete option, which reduces the risk of word splitting. That could result in find "${BACKUP_LOCATION}/mailcow-"* -maxdepth 0 -mtime +${1} -delete - but -delete can only handle files and empty dirs, so the DB dumps of older backups would not be removed; an -exec rm might still be needed for 100% backwards compatibility. The DB backup is also the only one not using pigz for compression.
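        A demo of the -delete variant on throwaway directories (paths and the 7-day threshold are made up; note -delete silently skips non-empty dirs, which is exactly the DB-dump caveat above):

        ```shell
        # Demo: prune empty backup dirs older than 7 days with -delete
        mkdir -p /tmp/bnr_prune/mailcow-old /tmp/bnr_prune/mailcow-new
        touch -d '10 days ago' /tmp/bnr_prune/mailcow-old   # backdate (GNU touch)
        find /tmp/bnr_prune/mailcow-* -maxdepth 0 -mtime +7 -delete
        ls /tmp/bnr_prune   # only mailcow-new remains
        ```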

        Conclusion

        I would offer to refactor/rewrite the backup and restore script after agreeing with you moo’ists that all (or at least some, please 🤓) of my findings are correct. Because the script would go through a complete rewrite, I am hesitant to provide a PR before discussion…

        And sorry for the long post. 🤷‍♂️