From Cyrus

Low Bandwidth Replication

Executive Summary

  • Current Cyrus replication has bandwidth requirements for any mailbox change which are proportional to the number of messages in the mailbox rather than to the amount of actual change. This becomes worse when CONDSTORE is enabled, because even per-user \Seen changes require a mailbox sync.
  • While this isn't an issue within a single datacentre with high-speed links, it becomes an issue for remote replicas.
  • COMPRESS=DEFLATE support has already been added to CVS for the sync protocol, but that's not a complete solution, just something that will be handy anyway.

Pre-conditions

  • Merge Index
  • modseq calculation always on, i.e. condstore always supported.

Index Format Changes

A new field, LAST_EXPIRE_TIME, is added to the cyrus.index header. It is needed so that both ends of a replication pair can compress the index file and still retain the same records.

Implementation

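Sync protocol workings (lines starting with > are sent by the sync client, lines starting with < are the server's replies):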

> MAILBOXES shared.notices user.foo.bar user.xyz user.xyz.Trash
< * shared.notices <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< * user.foo.bar <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< * user.xyz <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< * user.xyz.Trash <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< OK MAILBOXES completed

From this we determine that user.xyz was unmodified, user.foo.bar has the same lastuid but a lower highestmodseq on the server, and user.xyz.Trash has had a message appended (both highestmodseq and lastuid are lower on the server). Finally, shared.notices has a different last_expire time.

Given that highestmodseq, lastuid and header_crc are IDENTICAL, we can safely assume that no changes need to be made to user.xyz. Even if, by some rare clash, conflicting changes had been made at both ends and had produced identical CRCs right through the chain, they are very likely to cause a mismatch at the next change, when the CRC calculations have to be redone. At that point a full sync will be done on that mailbox. This can only occur after a split brain.
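The comparison of the MAILBOXES response against local state can be sketched in Python as follows. This is only an illustration under assumptions: the MailboxState tuple, the classify function and the action names are hypothetical and not part of Cyrus, and the choice to fall back to a full sync whenever the server appears to be ahead is a guess at sensible split-brain handling. It returns only the first action needed; the state is compared again after each step.

```python
from typing import NamedTuple


class MailboxState(NamedTuple):
    """The fields reported on each '* <mailbox> ...' line (names assumed)."""
    uniqueid: str
    highestmodseq: int
    lastuid: int
    header_crc: int
    last_expire: int


def classify(client: MailboxState, server: MailboxState) -> str:
    """Return the next action needed to bring the server in line with the client."""
    if client.uniqueid != server.uniqueid:
        return "full_sync"  # assumption: a different mailbox lives under this name
    if server.highestmodseq > client.highestmodseq or server.lastuid > client.lastuid:
        return "full_sync"  # assumption: server is ahead, possible split brain
    if client.last_expire != server.last_expire:
        return "expire"     # covers both an older and a newer server timestamp
    if client.lastuid > server.lastuid:
        return "append"     # new messages, possibly flag changes as well
    if client.highestmodseq > server.highestmodseq:
        return "records"    # changes to existing records only
    if client.header_crc != server.header_crc:
        return "full_sync"  # counters agree but contents differ
    return "in_sync"
```

With the four mailboxes from the example, user.xyz would come out as "in_sync", user.foo.bar as "records", user.xyz.Trash as "append" and shared.notices as "expire".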

So, for user.foo.bar, we read the client's cyrus.index and determine which records have a higher modseq than server->highestmodseq.

> RECORDS user.foo.bar <offset> <index record> 0 <offset> <index record> 0 [...]
< * user.foo.bar <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< OK RECORDS completed

At this point, highestmodseq, lastuid, header_crc and last_expire ALL MATCH. We have successfully updated user.foo.bar with bandwidth use proportional to the number of changed records, not the total number.

The '0' after each record means "don't need to copy the message file from stage".
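As a rough sketch of how the client might assemble that RECORDS line, the Python below filters index records by modseq and appends the copy flag. The IndexRecord type, its encoded field and build_records_command are hypothetical names for illustration, and the real wire encoding of an index record is not specified here. The flag is 0 for records the server already has; a record whose UID lies beyond the server's lastuid gets 1, which is the append case covered further down.

```python
from typing import List, NamedTuple


class IndexRecord(NamedTuple):
    offset: int    # position of the record in cyrus.index
    uid: int
    modseq: int
    encoded: str   # placeholder for the record's wire encoding (assumed)


def build_records_command(mailbox: str, records: List[IndexRecord],
                          server_highestmodseq: int, server_lastuid: int) -> str:
    """Build a RECORDS line containing only the records the server lacks."""
    parts = ["RECORDS", mailbox]
    for rec in records:
        if rec.modseq <= server_highestmodseq:
            continue  # server already has this version of the record
        # 1 = server must copy the message file from the stage (new UID)
        copy_from_stage = 1 if rec.uid > server_lastuid else 0
        parts.extend([str(rec.offset), rec.encoded, str(copy_from_stage)])
    return " ".join(parts)


records = [
    IndexRecord(96, 1, 5, "<rec1>"),
    IndexRecord(192, 2, 12, "<rec2>"),
    IndexRecord(288, 3, 15, "<rec3>"),
]
command = build_records_command("user.foo.bar", records,
                                server_highestmodseq=10, server_lastuid=3)
```

Only the two records with a modseq above 10 end up in the command, so the line grows with the number of changed records rather than with the size of the mailbox.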

What's left? user.xyz.Trash. lastuid is lower, so there are appends, as well as potential flag changes.

> RESERVE <guid1> <guid2> [...]
< * RESERVE <guid1>
< OK RESERVE completed

guid1 was found on the server in one of the mailboxes mentioned during this sync run. guid2 wasn't found, hence no RESERVE record.

> UPLOAD <guid2> {size+}
> ...
< * RESERVE <guid2>
< OK UPLOAD completed

Now the server has copies of both messages staged.
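The RESERVE and UPLOAD exchange boils down to a set difference: anything the server did not acknowledge has to be uploaded. A minimal Python sketch, assuming the response lines look exactly like the transcript above (parse_reserve_response and guids_to_upload are hypothetical helper names):

```python
from typing import Iterable, List, Set


def parse_reserve_response(lines: Iterable[str]) -> Set[str]:
    """Collect the GUIDs from untagged '* RESERVE <guid>' lines."""
    reserved = set()
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "*" and fields[1] == "RESERVE":
            reserved.add(fields[2])
    return reserved


def guids_to_upload(wanted: List[str], reserved: Set[str]) -> List[str]:
    """GUIDs the server could not find locally, in the order they were requested."""
    return [guid for guid in wanted if guid not in reserved]


response = ["* RESERVE guid1", "OK RESERVE completed"]
missing = guids_to_upload(["guid1", "guid2"], parse_reserve_response(response))
```

Here missing contains only guid2, so a single UPLOAD is needed before the RECORDS command can reference both messages.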

> RECORDS user.xyz.Trash <offset> <index record> 1 <offset> <index record> 1 [...]
< * user.xyz.Trash <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< OK RECORDS completed

This works just like the earlier RECORDS command, with the added proviso that the server knows to copy the message files from the stage because we passed a '1'. (Exact implementation details might change, of course; the server could even determine this itself based on the record being past mailbox->num_records.)

Note the response was a mailbox state statement, which again matches. Fantastic, we know that was everything. Of course, we would have issued RECORDS lines for any non-append record with a higher modseq as well if necessary.

Note that I missed one: shared.notices. The last_expire timestamp didn't match. We read the client's last_expire and issue:

> EXPIRE shared.notices <at_timestamp> <expire_days> 
< * shared.notices <uniqueid> <highestmodseq> <lastuid> <header_crc> <last_expire>
< OK EXPIRE completed

If they now match, good. Otherwise, proceed with the steps above. In a pathological case, the last_expire on the server is NEWER. In that case you need to run the expire on the client with the server's timestamp to get them in sync. By passing "expire_days" we ensure both ends are using the same expire policy. With this setup the expire run should only ever happen on the master end, or should be run "at the same time"[tm] on both ends. The easiest way to do the latter is to run cyr_expire with a fixed time quantization, and have it run with the same base time at each end.
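The time quantization idea can be illustrated with a few lines of Python. This is a sketch only: quantized_expire_time and expire_cutoff are hypothetical helpers, not cyr_expire options, and the one-day quantum and the reading of expire_days as "older than this many days before at_timestamp" are assumptions.

```python
SECONDS_PER_DAY = 86400


def quantized_expire_time(now: int, quantum: int = SECONDS_PER_DAY, base: int = 0) -> int:
    """Round 'now' down to the most recent multiple of 'quantum' counted from 'base'."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    return base + ((now - base) // quantum) * quantum


def expire_cutoff(at_timestamp: int, expire_days: int) -> int:
    """Assumed policy: anything older than this cutoff is expired."""
    return at_timestamp - expire_days * SECONDS_PER_DAY


# Two ends that start their expire runs a few minutes apart
master_at = quantized_expire_time(1_000_000)
replica_at = quantized_expire_time(1_000_900)
```

Both ends arrive at the same at_timestamp despite starting at different moments, so they would apply the same cutoff and end up with the same last_expire value.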

I think that's everything. Oh, I didn't explain changes in cyrus.header. Basically, you would have to fetch the userflags string from the server and compare it to the client's. If there are no clashing names, make them identical. Otherwise you'd need to do a full mailbox sync to find out what the flags are on the server, and lock while you renumbered the server end. A big pain, but very rare. It's essential that the ordering of user flags be exactly the same at both ends for the checksumming to be efficient. Otherwise you'd have to calculate an equivalent checksum at each end for the comparison to work. Possible, but much more expensive just to get a yes or no to the "do I need to sync" question.
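One possible reading of "no clashing names" is sketched below in Python: the flag lists are compared slot by slot, and a clash is either two different names in the same slot or the same name in two different slots. The merge_userflags function, the use of None for an empty slot and the case-sensitive comparison are all assumptions made for illustration.

```python
from typing import List, Optional


def merge_userflags(client: List[Optional[str]],
                    server: List[Optional[str]]) -> Optional[List[Optional[str]]]:
    """Return the merged slot list, or None if a full mailbox sync is needed."""
    size = max(len(client), len(server))
    client_slots = list(client) + [None] * (size - len(client))
    server_slots = list(server) + [None] * (size - len(server))

    merged = []
    for client_flag, server_flag in zip(client_slots, server_slots):
        if client_flag and server_flag and client_flag != server_flag:
            return None  # same slot, different names: renumbering required
        merged.append(client_flag or server_flag)

    names = [name for name in merged if name]
    if len(names) != len(set(names)):
        return None  # same name in two slots: orderings disagree
    return merged


merged = merge_userflags(["$Label1", None, "$Junk"], ["$Label1", "$Work"])
```

When the merge succeeds, writing the merged list to both ends keeps the flag ordering identical, which is what the header checksum comparison relies on.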