Exchange 2010 Journaling: Stuck messages in Submission queue

Update: This appears to be a documented bug, described in KB2681250.
However, the KB says it should be patched as of SP2 RU3, and it still appears to be an issue in SP3. Opening a support case now.

Further update: This looks like a distinct bug from the one reported in the above KB, and it is scheduled to be patched in SP3 RU3. Specifically, my issue seems to involve mostly meeting invite replies (Accepted, Declined, etc.).

There’s an issue that pops up periodically with Exchange Journaling which, anecdotally, is caused by having a BES server attached to your Exchange environment or by a user having a broken signature in the Outlook client. Some or all of the affected messages appear to be calendaring responses with a subject like “Accepted: Meeting Name” or “Declined: Meeting Name”.

Messages get stuck in “Retry” in the Submission queue, which shows as an Event 9213 in the Application log:

Log Name: Application
Source: MSExchangeTransport
Date: <date>
Event ID: 9213
Task Category: Categorizer
Level: Error
Keywords: Classic
User: N/A
Computer: <servername>
Description:
A non-expirable message with the Internal Message ID 12345678 could not be categorized. This message may be a journal report or other system message. The message will remain in the queue until administrative action is taken to resolve the error. Other messages may also have encountered this error. To further diagnose the error, use the Queue Viewer or the Exchange Mail Flow Troubleshooter.
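
To see how often the error fires, the same events can be pulled from the shell. This is a minimal sketch, assuming PowerShell 2.0 or later and that it is run on the transport server itself; the limit of 20 events is arbitrary.

```powershell
# List the most recent Event 9213 entries logged by MSExchangeTransport.
# Get-WinEvent raises an error if no matching events exist.
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'MSExchangeTransport'
    Id           = 9213
} -MaxEvents 20 |
    Select-Object TimeCreated, Id, Message |
    Format-List
```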

The message details show:

Last Error: 400 4.4.7 The server responded with: 550 5.6.0 M2MCVT.StorageError; storage error in content conversion. The failure was replaced by a retry response because the message was marked for retry if rejected.

A note about the line “The server responded with: 550 5.6.0…”: the responding server appears to be the local Exchange server, not the intended destination server. That makes sense, since the message is in the “Submission” queue and not in one of the destination queues.
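
The same details can be read from the Exchange Management Shell. The sketch below lists every message in Retry along with its queue and last error; the server name EXCH01 is a hypothetical placeholder, and the property names are the ones I would expect Get-Message to return in Exchange 2010.

```powershell
$server = 'EXCH01'   # hypothetical server name; replace with your own

# Show each message in Retry, the queue it sits in, and its last error.
Get-Message -Server $server -Filter {Status -eq 'Retry'} |
    Format-List Identity, Queue, FromAddress, Subject, LastError
```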

Without a way to specifically stop the messages, they can at least be manually removed from the queue using:

Get-Message -Server <servername> -Filter {FromAddress -eq "<>" -and Status -eq 'Retry'} | Remove-Message

Note the second part of the filter, Status -eq 'Retry'. Normal journaling messages also fly by with blank from addresses, and this condition avoids removing any of those.
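
Before deleting anything, it may be worth reviewing what the filter matches. This is a cautious sketch, not a required procedure: EXCH01 is a hypothetical server name, and -WithNDR $false is my addition to suppress any non-delivery reports.

```powershell
$server = 'EXCH01'   # hypothetical server name; replace with your own

# Review what the filter matches before removing anything.
Get-Message -Server $server -Filter {FromAddress -eq "<>" -and Status -eq 'Retry'} |
    Format-Table Identity, Subject, Queue -AutoSize

# Dry run: -WhatIf reports what would be removed without removing it.
Get-Message -Server $server -Filter {FromAddress -eq "<>" -and Status -eq 'Retry'} |
    Remove-Message -WithNDR $false -WhatIf
```

Dropping -WhatIf from the last command performs the actual removal.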

Providing Exchange 2010 Activesync HA with multiple sites

Update: a bug in Exchange 2010 SP3 breaks the mechanism described below.  See http://msitpros.com/?p=1818 for a good writeup.  Apparently slated for a fix in RU2.  As a workaround, I changed the activesync.company.com CNAME to point only to the activesync-site1.company.com record and the errors are resolved.

Assuming you already have a solid handle on DR for your mailboxes, the next most important thing is client connectivity.  Many DR solutions for AS (and OWA too) involve repointing DNS to the DR site in case of a disaster.  Changes to our external DNS (it’s hosted) take well over an hour to propagate.  That problem can be avoided.

By using extra DNS entries and the built-in proxying and redirection functions in the Activesync service, AS clients can seamlessly fail over between datacenters. It requires adding one extra level to the namespace, which means more Subject Alternative Names on the SSL cert.  Beyond that, I’ve perhaps over-complicated the namespace a bit more here, but I don’t like changing A records, so I have some CNAMEs that can be shifted in a pinch without modifying the A records.

The idea is that regardless of which name is used to access the Activesync service, the CAS server checks whether it is in the same site as the target mailbox and, if it is not, responds based on the version of the client:

  • EAS 14.0+ clients will get a 451 redirect to the externalURL of the EAS Virtual Directory on a CAS in the same site as the current mailbox location.
  • EAS 12.1 or earlier clients will be proxied to the CAS in the correct site if the /proxy directory is set for Windows Authentication.
    Note: In my case, both datacenters are connected internally by a fast link, so any proxied traffic is not an issue.
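
The decision described above can be modelled as a small function. This only illustrates the behaviour as described in this post; it is not Exchange's actual logic, and the function name, parameters, and return values are all hypothetical.

```powershell
# Hypothetical model of how a CAS handles an Activesync request,
# following the rules described in this post.
function Get-EasSiteAction {
    param(
        [Parameter(Mandatory = $true)][double]$ProtocolVersion,  # EAS version, e.g. 12.1 or 14.0
        [Parameter(Mandatory = $true)][string]$CasSite,          # site of the CAS that got the request
        [Parameter(Mandatory = $true)][string]$MailboxSite       # site that currently hosts the mailbox
    )

    # Same site: the CAS serves the request itself.
    if ($CasSite -eq $MailboxSite) { return 'Serve' }

    # Different site, EAS 14.0+: 451 redirect to the other site's externalURL.
    if ($ProtocolVersion -ge 14.0) { return 'Redirect451' }

    # Different site, older client: proxy to a CAS in the mailbox's site.
    return 'Proxy'
}

Get-EasSiteAction -ProtocolVersion 14.0 -CasSite 'Site1' -MailboxSite 'Site2'
Get-EasSiteAction -ProtocolVersion 12.1 -CasSite 'Site1' -MailboxSite 'Site2'
```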

The setup is as follows:

Names highlighted in orange need to be listed as SANs or as the common name of the SSL cert.

These two DNS A records are specific to the two sites and are used as the domain names of the external Activesync URLs for the two locations.  (The client uses these names if it is redirected.)
A activesync-site1.company.com   192.168.1.100
A activesync-site2.company.com  192.168.2.100
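
For the redirect to hand out these names, each site's Activesync virtual directory needs the matching external URL. This is a sketch assuming one CAS per site with the default virtual directory name on the Default Web Site; CAS1 and CAS2 are hypothetical server names.

```powershell
# CAS1 and CAS2 are hypothetical CAS server names, one per site.
Set-ActiveSyncVirtualDirectory -Identity 'CAS1\Microsoft-Server-ActiveSync (Default Web Site)' `
    -ExternalUrl 'https://activesync-site1.company.com/Microsoft-Server-ActiveSync'
Set-ActiveSyncVirtualDirectory -Identity 'CAS2\Microsoft-Server-ActiveSync (Default Web Site)' `
    -ExternalUrl 'https://activesync-site2.company.com/Microsoft-Server-ActiveSync'

# Confirm what each virtual directory now advertises.
Get-ActiveSyncVirtualDirectory | Format-Table Server, ExternalUrl -AutoSize
```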

These overloaded A records are used for the round robin.
A activesync-allsites.company.com 192.168.1.100
A activesync-allsites.company.com 192.168.2.100

The last record is just for my convenience. I can point it directly at site2 if site1 goes down; otherwise it always points at activesync-allsites.company.com.
CNAME activesync.company.com -> activesync-allsites.company.com
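
Once the records are in place, a quick resolution check confirms that each name returns the expected addresses. This sketch uses the .NET resolver rather than any Exchange cmdlet, and it reports whatever the DNS servers of the machine it runs on return, so internal and external results may differ.

```powershell
# The four names defined in the records above.
$names = 'activesync-site1.company.com',
         'activesync-site2.company.com',
         'activesync-allsites.company.com',
         'activesync.company.com'

foreach ($name in $names) {
    try {
        # GetHostAddresses follows CNAMEs and returns every A record found.
        $addresses = [System.Net.Dns]::GetHostAddresses($name) |
            ForEach-Object { $_.IPAddressToString }
        '{0,-32} {1}' -f $name, ($addresses -join ', ')
    }
    catch {
        # The lookup throws if the name does not resolve.
        '{0,-32} did not resolve' -f $name
    }
}
```

With the records as listed, the two round-robin names should each return both site addresses.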

When not using Autodiscover to configure the AS device, this makes it easy to give the user the server name activesync.company.com and let the back-end functions take care of getting the device connected to the right place.