Recently I've been planning to update our dovecot installation and migrate from maildir to dbox, which raised the question: should we go with sdbox or with mdbox?
I've been reading some threads on the dovecot mailing list and compiled this list of questions and answers to help with the decision. All answers are from Timo Sirainen:
1. What is the advantage to using multiple files?
A: mdbox in theory uses less disk I/O for "normal users".
2. What is the advantage to using sdbox, with a single file for each message?
A: It's simpler. More difficult to get corrupted. Also if in future there exists a filesystem that supports smaller files better, it's then faster than mdbox. Probably unlikely that it will happen anytime soon.
3. Is this a binary format, or txt (UTF?)?
A: dbox headers/metadata is ASCII. The message bodies can of course be anything.
4. Are there real-world benchmarks showing measurable differences between maildir, sdbox, mdbox?
A: Not that I'm aware of. So far everyone I've tried to ask has replaced their whole mail system and their storage, so the before/after numbers can't be compared. I'm very interested in knowing myself too.
5. Are sdbox & mdbox as stable as Maildir? Are they recommended for production systems?
A: sdbox is so simple that I doubt anyone will find any kind of corruption bugs. mdbox is more complex, but people are using it in production and I haven't heard of any problems recently. Although there have been bugs in how mdbox handles already corrupted files, v2.0.10 had several fixes related to that.
6. In mdbox we should not use a ramdisk for indexes. But what about sdbox? Do sdbox indexes work like maildir indexes? Are sdbox indexes bigger than maildir indexes?
A: If this is a heavy use box, having everyone's indexes being rebuilt at the same time could bring it to its knees...
Since this is a server I'm sure you have adequate power protection (UPS), so only extended power outages might be an issue - but then you should also have it configured to safely shut down in this event, no?
But anyway, yes, the indexes will be rebuilt and everything should continue working...
7. One of the main advantages (speed wise) of dbox over maildir is that index files are the only storage for message flags and keywords. What happens when we want to recover some messages from backup? With maildir we can rebuild message indexes, but I am not sure about dbox. Should we also restore "old indexes" and merge with the "new indexes" in order to restore the deleted messages?
A: The intended way to restore stuff is to either restore the entire dbox to a temp directory, or at least all the important parts of it (indexes + the files that contain the wanted mails) and then use something like:
doveadm import sdbox:/tmp/restoredbox "" savedsince 2011-01-01
8. The previous question applies to both sdbox and mdbox. In the case of mdbox, we can configure rotation of files using mdbox_rotate_size. We would like to rotate daily, not based on size (our users ask us for yesterday's backup). How can we accomplish this?
A: mdbox_rotate_interval = 1d
But note that that doesn't guarantee that there will be only one file. Even if you set mdbox_rotate_size to 10 GB or something (or I think 0 makes it unlimited, not sure), it's possible that two files will be created if mails are being saved at the same time. mdbox never waits for locks when writing to a file, instead it'll just use another file or create a new one.
Anyway, if it's not a big deal restoring the user's entire mailbox temporarily you can restore only yesterday's mails by giving proper search query parameter to doveadm import.
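As a rough sketch of what such a search query could look like, the helper below only prints a doveadm import command for review; it does not run it. The function name build_import_cmd, the account name jane, the restore path and the dates are all hypothetical placeholders. Pairing savedsince with savedbefore to bound a single day is my assumption about how to narrow the query, not something stated in the thread.

```shell
#!/bin/sh
# Sketch: print (not run) a doveadm import command that pulls one day's
# mail out of a restored dbox copy. All values passed in are placeholders.
build_import_cmd() {
    user=$1        # hypothetical account name
    restored=$2    # restored copy, e.g. mdbox:/tmp/restoredbox
    day=$3         # first day to include, YYYY-MM-DD
    next_day=$4    # following day, YYYY-MM-DD (exclusive upper bound)
    printf 'doveadm import -u %s %s "" savedsince %s savedbefore %s\n' \
        "$user" "$restored" "$day" "$next_day"
}

# Example: mails saved on 2011-01-01 only.
build_import_cmd jane mdbox:/tmp/restoredbox 2011-01-01 2011-01-02
```

Printing the command first makes it easy to check the restored location and the date range before anything is imported into the live mailbox.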
9. We now have 17.000.000 messages in our maildir, almost 1.5 TB (zlib compression enabled). Our backup time with bacula is rather bad: 24 hours for a full backup, and most of that time the backup is busy fstat'ing all those little messages.
A: In case of Maildir there's no point in fstating any mail files. I'd guess it should be possible to patch bacula to not do that.
10. We think that mdbox can help us with this. Does anybody have good experiences migrating from maildir to mdbox in "large" environments? What about mdbox performance & reliability?
A: I haven't recently heard of corruption complaints about mdbox. Previously, when there were some, I didn't hear any complaints about losing mails or anything, so that's good :)
Someone's experience from around 2011-03 (I believe that dsync has improved since then):
- sdbox was using far too much I/O on a busy server, so I had to switch to mdbox. sdbox is not sustainable with very large mailboxes: I/O becomes too high (even with high-end storage devices).
- Timo said that sdbox is not expected to have more I/O than maildir.
- Mdbox is running well so far, and resources (IO or CPU) are not an issue anymore.
- Converting from Maildir to s/mdbox is easy
- Converting from sdbox to mdbox has been a complete nightmare. I never managed to complete it with dsync, which was too buggy for that conversion. The only solution I found was to go through the IMAP protocol between two dovecot instances, reading from sdbox and writing as mdbox. You are better off choosing between sdbox and mdbox up front than trying to convert between the two later.