Project Cyrus Archives: IMAP Aggregator and ACAP

This document is obsolete. Most of the changes have been incorporated into the 2.0 version of ag.html.

introduction

The Cyrus IMAP Aggregator will use ACAP as the central repository for mailbox information. This will let next-generation clients bypass the frontend servers efficiently and will provide additional functionality in the form of a "master update server" service. This document contains notes on the future implementation.

definitions

Here are terms I'm using:
mailbox tree
The collection of all mailboxes at a given site in a namespace is called the mailbox tree. Generally, the user Bovik's personal data is found in user.bovik.
mailboxes database
A local database containing a list of mailboxes known to a particular server. (In old Cyrus terms, this maps to /var/imap/mailboxes.)
mailbox dataset
The store of mailbox information on the ACAP server is the "mailbox dataset".
mailbox operation
The following IMAP commands are "mailbox operations": CREATE, RENAME, DELETE, and SETACL.
quota operations
The quota IMAP commands (GETQUOTA, GETQUOTAROOT, and SETQUOTA) operate on mailbox trees. In future versions of Cyrus, it is expected that a quotaroot will be a subset of a mailbox tree that resides on one partition on one server. For rationale, see section xxx.
DUMP/RESTORE
It would be nice to have an in-protocol way of dumping out an entire mailbox (with associated flag state) and restoring it on the same or a different IMAP server.
IMAP connection
A single IMAP TCP/IP session with a single IMAP server is a "connection".
IMAP client
A client is a process on a remote computer that communicates with the set of servers distributing mail data, be they ACAP, IMAP, LDAP, or IMSP servers. A client opens one or more connections to various servers.

assumptions

I'm going to make the following assumptions about mailbox operation:

operation and implementation

mailboxes database

internal consistency versus external consistency. internal consistency ensures the database is not corrupted. this is easy to verify for flat text files, harder for B-Tree databases. external consistency ensures that the database accurately reflects information on the spool disks or on the backend servers.

what sort of database? flat text: simple, currently works, and gives "always internally consistent". plus, readers can operate on older versions while a writer updates the master version and swaps it in.
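The "writer updates the master version and swaps it in" behaviour can be sketched roughly as below. This is only an illustration, not the Cyrus code: the function names and the tab-separated "mailbox, server" line format are assumptions of mine. It relies on rename being atomic within one POSIX filesystem, so a reader that already has the old file open keeps reading the old, complete version.

```python
import os
import tempfile


def rewrite_mailboxes(path, entries):
    """Write a new flat-text mailboxes file and atomically swap it in.

    entries maps mailbox name -> server (a hypothetical format).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".mailboxes.")
    try:
        with os.fdopen(fd, "w") as tmp:
            for name in sorted(entries):
                tmp.write(f"{name}\t{entries[name]}\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_mailboxes(path):
    """Read the whole file back into a dict."""
    with open(path) as f:
        return dict(line.rstrip("\n").split("\t", 1) for line in f)
```

The cost of this approach is that every update rewrites the whole file, which is the performance concern that motivates the Berkeley DB options.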

one writer/many readers berkeley db: offers better performance for lookups, but may actually hurt performance since the writer needs to guarantee that no readers are present before it starts writing. if we crash, there are no guarantees on internal consistency.

transactional berkeley db: incurs heavy locking and logging overheads, but possibly offers the best performance. allows multiple readers and multiple writers, allows lock upgrading. however, applications need to deal with deadlock and crashing can be disastrous. ensures internal consistency.

what's true? the theoretical list of mailboxes we're trying to distribute is the set of mailboxes that actually exist on a spool disk across all of the backend servers. As we get farther away from that, the consistency of the data will degrade. What sort of degradation can we afford? Is the game to let the data get as inconsistent as possible while maintaining client operation? Keep the data as consistent as possible while maintaining acceptable performance?

the frontend servers

There are two classes of processes: proxyd and the murder-front synchronization process.

The proxyd processes handle the IMAP sessions with clients. They rely on a consistent and complete mailboxes database that reflects the state of the world. They never write to the mailboxes database.

What happens if the mailboxes database on the frontend is out of date? Can proxyd act proactively? When a client SELECTs a mailbox that doesn't appear in the database, proxyd sends a signal to murder-front to make sure all updates have occurred.

The murder-front process (one per frontend) holds open an ACAP context and listens for updates from the ACAP server (that come from the backend servers making modifications on the mailbox dataset) and makes these modifications on the local copy of the mailboxes database. Should the murder-front process totally rewrite the mailboxes database when it starts up?
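A rough sketch of the murder-front update loop follows. The tuple format for updates ("add", "change", "remove") is a stand-in I made up for illustration; it is not the ACAP wire format, and the local mailboxes database is modelled as a plain dict of mailbox name to backend server.

```python
def apply_update(mailboxes, update):
    """Apply one update from the ACAP context to the local copy.

    update is a hypothetical tuple: (kind, mailbox[, server]).
    """
    kind, name, *rest = update
    if kind in ("add", "change"):
        mailboxes[name] = rest[0]      # rest[0] is the backend server
    elif kind == "remove":
        mailboxes.pop(name, None)      # removing an unknown mailbox is a no-op
    else:
        raise ValueError(f"unknown update kind: {kind!r}")


def murder_front_loop(mailboxes, updates):
    """Drain a stream of updates; murder-front is the only writer."""
    for update in updates:
        apply_update(mailboxes, update)
    return mailboxes
```

Making the updates idempotent, as here, means that replaying the whole dataset at startup and applying incremental updates can share one code path, which bears on the "rewrite at startup" question.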

All mailbox operations get forwarded to the appropriate backend server, so the only one that's tricky is CREATE. To CREATE foo.bar (all danger of inconsistency rests in the hands of the backend server):

  1. proxyd: verify that foo.bar doesn't exist in local mailboxes database
  2. proxyd: decide where to send CREATE (hierarchy, random, client selection via third argument)
  3. proxyd -> backend: duplicate CREATE command
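Steps 1 and 2 above might look something like this. The order of preference (client selection first, then hierarchy, then random) is my assumption; the notes only list the three options. `choose_backend` and its parameters are hypothetical names.

```python
import random


def choose_backend(name, mailboxes, backends, requested=None, rng=random):
    """Pick the backend that should receive a CREATE for `name`.

    mailboxes: local mailboxes database, mailbox name -> backend server
    backends:  list of known backend servers
    requested: server named by the client in the third CREATE argument
    """
    # Step 1: the mailbox must not already exist locally.
    if name in mailboxes:
        raise ValueError(f"{name} already exists")

    # Step 2a: client selection.
    if requested is not None:
        if requested not in backends:
            raise ValueError(f"unknown backend {requested}")
        return requested

    # Step 2b: hierarchy. Use the server of the nearest existing ancestor.
    parent = name
    while "." in parent:
        parent = parent.rsplit(".", 1)[0]
        if parent in mailboxes:
            return mailboxes[parent]

    # Step 2c: no ancestor exists anywhere, so pick at random.
    return rng.choice(backends)
```

Step 3 is then just relaying the client's CREATE to the chosen server; proxyd itself never writes to the mailboxes database.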

To SELECT foo.bar:

  1. proxyd: lookup foo.bar in local mailboxes database
  2. if yes, proxyd -> backend: send SELECT
  3. if no, proxyd -> murder-front -> ACAP: UPDATECONTEXT. we need to make sure that all changes the ACAP server has received have been propagated to the frontend.
  4. if mailbox still doesn't exist, fail operation

This makes SELECTs on mailboxes that don't exist much more expensive. I don't think this is a problem.
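The SELECT steps can be sketched as below. `MurderFront` here is a toy stand-in with a made-up interface: `pending` plays the role of changes the ACAP server has received but the frontend has not yet applied, and `update_context()` stands for the UPDATECONTEXT round trip.

```python
class MurderFront:
    """Toy stand-in for the murder-front process (hypothetical interface)."""

    def __init__(self, mailboxes, pending):
        self.mailboxes = mailboxes   # the shared local mailboxes database
        self.pending = pending       # list of (mailbox, server) not yet applied
        self.syncs = 0               # how many forced syncs were needed

    def update_context(self):
        """Apply everything the ACAP server already knows about."""
        self.syncs += 1
        while self.pending:
            name, server = self.pending.pop(0)
            self.mailboxes[name] = server


def select(name, mailboxes, murder_front):
    """Return the backend that should receive the SELECT, or fail."""
    server = mailboxes.get(name)            # step 1: local lookup
    if server is None:
        murder_front.update_context()       # step 3: force a sync
        server = mailboxes.get(name)
    if server is None:
        raise LookupError(f"mailbox {name} does not exist")   # step 4
    return server                           # step 2: forward SELECT here
```

Only the miss path pays for the extra round trip; a SELECT on a mailbox already in the local database never touches murder-front.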

To RENAME foo.bar aaa.bbb: Do we allow cross-server renames?

the backend servers

Each backend server maintains a local mailboxes database, listing what mailboxes are available on that server.

The imapd processes on the backend server stand by themselves, so that each backend IMAP server can be used in isolation without an ACAP server or any frontend servers. However, they may be configured so that they won't process any mailbox operations unless the master ACAP server can be contacted (allows for namespace consistency).

The imapd processes update the local mailboxes database themselves. However, on a CREATE they need to reserve a place with the ACAP server before proceeding with the creation. Thus a flag in the mailboxes dataset needs to be reserved for "in progress".

To CREATE foo.bar:

  1. imapd: verify ACLs to best of ability (CRASH: aborted)
  2. imapd may have to open an ACAP connection here if one doesn't already exist
  3. ACAP -> imapd: verify parent ACLs if need be (CRASH: aborted)
  4. imapd: start mailboxes transaction (CRASH: aborted)
  5. imapd -> ACAP: set foo.bar inprogress (CRASH: ACAP externally inconsistent)
  6. imapd: create foo.bar in spool disk (CRASH: ACAP externally inconsistent, backend externally inconsistent)
  7. imapd: add foo.bar to mailboxes database (CRASH: ditto)
  8. imapd: commit transaction (CRASH: ACAP externally inconsistent)
  9. imapd -> ACAP: set foo.bar active (CRASH: committed)

Failure modes: Above, all backend inconsistencies result in the next CREATE attempt failing. The earlier ACAP inconsistency results in any attempts to CREATE the mailbox on another backend failing. The latter one makes the mailbox unreachable and uncreatable.
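To make the failure modes concrete, here is a toy model of steps 4 through 9 with crash injection. Everything in it is an assumption for illustration: the ACAP server is a shared dict, the spool and local mailboxes database are sets, step 7 is modelled as a write to the local database inside the transaction (my reading of the list), and any existing ACAP entry, in progress or active, blocks a new reservation.

```python
class Crash(Exception):
    """Simulated process crash, for illustration only."""


class CreateError(Exception):
    """The CREATE was refused."""


class Backend:
    def __init__(self, name, acap):
        self.name = name
        self.acap = acap         # stand-in for ACAP: mailbox -> (server, state)
        self.spool = set()       # mailboxes that exist on the spool disk
        self.mailboxes = set()   # committed local mailboxes database

    def create(self, mailbox, crash_after=None):
        def maybe_crash(step):
            if crash_after == step:
                raise Crash(f"crashed after step {step}")

        # Steps 1-3 (ACL checks, opening the ACAP connection) are omitted.
        txn = set(self.mailboxes)                      # 4. start transaction
        maybe_crash(4)
        if mailbox in self.acap:                       # 5. reserve the name
            raise CreateError(f"{mailbox} is reserved or active in ACAP")
        self.acap[mailbox] = (self.name, "inprogress")
        maybe_crash(5)
        self.spool.add(mailbox)                        # 6. create on spool disk
        maybe_crash(6)
        txn.add(mailbox)                               # 7. add inside transaction
        maybe_crash(7)
        self.mailboxes = txn                           # 8. commit
        maybe_crash(8)
        self.acap[mailbox] = (self.name, "active")     # 9. mark active
```

A crash after step 5 leaves an "inprogress" entry that makes a CREATE of the same name on any other backend fail. A crash after step 8 leaves the mailbox committed locally but never marked active, which is the unreachable and uncreatable case; some cleanup pass over stale "inprogress" entries would be needed to recover from either.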

To RENAME foo.bar aaa.bbb:

mail delivery

urg. sieve sure does complicate things. here are some proposals.

early sieve: The sooner that mail is run through Sieve scripts, the better. We want to reduce the total number of hops mail takes before it lands in the appropriate mailbox. This translates into the MX (remote submission) and SMTP (local submission) servers running the Sieve scripts themselves. This requires any submission host to contact the ACAP servers for the user's Sieve script, and for the mailboxes dataset to determine where to route the mail (via LMTP) when a Sieve script calls "fileinto" or "keep".

frontend sieve: Since the frontend servers already keep a copy of the global mailboxes database, they can easily process Sieve scripts efficiently. They still need to use LMTP to transfer messages to the final backend destination.

backend sieve: Since different backend servers are unaware of each other, running Sieve scripts on the backend has several disadvantages. Messages have to be routed to the backend server that holds the user's INBOX, and then the Sieve processing happens. Any fileinto actions that refer to non-local mailboxes fail. This breaks backend server transparency.
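The difference between the proposals comes down to which mailboxes database the Sieve actions get resolved against. A small sketch, assuming the script has already been evaluated into a list of ("keep", None) and ("fileinto", mailbox) actions, and that the user's INBOX lives at user.<name>; `route` and the mailbox names in the usage note are hypothetical.

```python
def route(actions, user, mailboxes):
    """Turn Sieve actions into a list of (backend, mailbox) LMTP deliveries.

    mailboxes maps mailbox name -> backend server, as seen by whoever
    is running the script.
    """
    deliveries = []
    for action, argument in actions:
        if action == "keep":
            target = f"user.{user}"
        elif action == "fileinto":
            target = argument
        else:
            raise ValueError(f"unsupported action: {action!r}")
        if target not in mailboxes:
            # The host running the script does not know this mailbox.
            raise LookupError(f"no known backend for {target}")
        deliveries.append((mailboxes[target], target))
    return deliveries
```

With the global view, say {"user.bovik": "b1", "shared.announce": "b2"}, a script that does a fileinto to shared.announce followed by a keep routes to b2 and b1 respectively. Run the same actions against backend b1's local view, {"user.bovik": "b1"}, and the fileinto raises LookupError, which is the transparency problem with backend sieve.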