Odd Response Group Delay issue

Sorry for the lack of updates lately. I’ve been working on a deployment and have run into an odd issue with response groups. I won’t go too far into detail, as it seems to be environment-specific, but the basic idea is that calls to response groups take about 5 seconds longer to complete than calls to a standard agent. By complete, I mean the voice stream starting and the call controls showing up. Hopefully this will make for an interesting blog post, as we’ve done a ton of troubleshooting with the fine folks at Microsoft and have a lot of real-life examples of the call process and where it’s going wrong in this case.

I do want to say this is an isolated incident. Please keep in mind that this isn’t something we expect to see from OCS, and I’m sure it will get fixed very quickly, wherever the problem is. If you have comments or thoughts, or would like to discuss this in detail, please post back here and I’d be happy to IM or email back and forth about it. It’s quite interesting, at least if you are a nerd like me!

About Kevin Peters

My name is Kevin Peters.

10 Responses to Odd Response Group Delay issue

  1. Kevin says:

    We are seeing the same behavior in our environment. We removed load-balancers and firewall and have a direct path from outside calls (Mediation server) to the call destination (end users). We are also working with Microsoft and have provided a large number of logs that they’ve been reviewing. Would be interested to hear what you’ve discovered.

    Thanks

    • Kevin Peters says:

      Kevin,

      What we saw was a long pause in traffic (half a second or so) between the first BYE from the client to the re-INVITE and the start of the RTP packet flow. I believe we’ve discussed this a good deal on the OCS forums. I’d be happy to look at your logs, but I’d suggest checking your Communicator ETL files and looking for failures there. In most cases where I’ve seen this, it was AV. I believe we also discussed temporarily pulling your A/V edge config and restarting the pool for testing. Did you try that? Feel free to email me back at kevin (at) this domain, and we can exchange info there.

      Thanks!

      -kp

  2. Luke says:

    Have you checked the client NIC drivers? May be a long shot, but we spent ~5 days troubleshooting a case with Microsoft regarding a similar issue (although the time was slightly longer). The issue arose with the way the client attempts to subscribe to the Edge server (even on PSTN calls) in conjunction with an older driver version.

  3. Kevin says:

    Hello –

    I would be interested to know if you were able to find the resolution to this, as we are having the same issue now. One interesting note for us: if the call is coming in from outside, it is about 3 seconds faster, but for MOC to MOC there is a 5 second delay.

    Thanks!

    • Kevin Peters says:

      Hi Kevin,

      In this case, we resolved the response group delay issue by turning off HTTPS inspection in Kaspersky AV. There are a number of other factors that could cause this issue, though, including an inability to talk to the edge on port 443. I would recommend running a Wireshark capture, turning on logging in MOC, and replicating the issue. You can also try an OCPE (Phone Edition) device to see whether it occurs there. If it doesn’t occur on the OCPE, then it’s likely to be a client-side issue. Feel free to post back with more details if the above suggestions don’t work.
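The "talk to the edge on 443" check above can be roughly scripted from a client machine. This is only a minimal sketch: the host name in the usage comment is a made-up placeholder, and a successful TCP connect only shows the port is reachable, not that an HTTPS-inspecting antivirus product is leaving the traffic alone.

```python
import socket
import time


def check_tcp(host, port=443, timeout=5.0):
    """Try a plain TCP connect and return (reachable, seconds_taken).

    A slow or failed connect from the client to the edge server's
    internal interface is one thing worth ruling out early.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and completes the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        reachable = False
    return reachable, time.monotonic() - start


# Example usage (hypothetical host name, replace with your internal edge FQDN):
#   ok, seconds = check_tcp("edge-internal.example.local", 443)
#   print(ok, round(seconds, 2))
```

If the connect takes several seconds or fails only while antivirus is enabled, that points toward the client side rather than the OCS servers.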

      Thanks!

      -kp

  4. Matt Parkinson says:

    We are currently seeing this same issue with a deployment we are working on and so far I have not been able to find the fix. We see anywhere from 2 to 4 seconds delay and your blog posts have made an interesting read on the issue and what can cause it.

    On the Snom OCS phones, the display says it is connecting back to the Mediation server after you answer, and as soon as that connects, the call and the audio finish connecting. The same happens on MOC.

    From my understanding, the issue is usually related to the implementation and is not a generic bug we should be seeing in OCS. So far I have tried removing the A/V edge and also changing the certificate to a non-SAN one, neither of which has made much difference.

    Do you have any advice on the issue on what I should be looking for?

    Our setup has a Standard edition FE, Mediation and Edge server.

    • Kevin Peters says:

      Hi Matt,

      Typically the delays are related to the implementation and not the product, as you stated. The exception to that is the Tanjay (CX700); prior to CU5 they had a delay as well.

      On typical deployments we do see a bit of a delay, but it is not typically 4 seconds. It sounds like you have done most of the normal troubleshooting, so I’ll throw out a few other things that may help.

      Temporarily disable AntiVirus and Firewalls on the OCS servers
      Install the latest updates for OCS and the MOC client (CU6 is the current)

      Do some logging on the Front End server (SIP Stack) and client to see what is happening. A network trace from the client will be good to look at as well.

      Last, but not least, look for patterns. Is the problem hurting the SNOM phones more than the MOC clients or Tanjays?

      Feel free to email me and we can exchange IM addresses to discuss further.

      Thanks for reading!

      -kp

      • Matt Parkinson says:

        Hi Kevin,
        Thanks for the information. The one thing you suggested that I had not tried was installing the updates. I have now done that; however, the problem still exists.

        I checked on both the Snom and MOC and the delays are pretty much the same. What I find though is that after I have made 2 or 3 calls within a few minutes the delay drops from 4 to 2 seconds.

        I will shoot you an e-mail with my IM if that’s ok as it may be easier to chat then. I will try and find one that we haven’t moved over to OCS PIC so it’s not running on the server.

        Thanks,
        Matt Parkinson.

  5. LeanIT says:

    Hi there, we are experiencing a similar issue with a 4-7 second delay when answering a response group call. We are actually running Lync Server. Basically, calling works fine with less than a second of delay. My guess is there is a config issue somewhere, but I’m not sure where to look. Nothing out of the ordinary shows up in the logs. I’ve seen some chat about the edge server. Does the edge server play a role in internal calls to a response group? Any help you can provide on where to start troubleshooting this response group delay issue would be appreciated.

    Tks
    Derek

  6. Kevin Peters says:

    Hi Derek,

    The first place to look is definitely the edge. Edge server connectivity is verified as part of call establishment, and if the user cannot reach the internal interface of the edge server, the call will be delayed. Many times we have seen this delay happen because client-side antivirus is blocking connectivity to the edge, so you may want to turn off AV and test a call. Also, please verify there are no SANs on the certificate for your internal edge. The CN on this certificate should be the FQDN of your edge server, and no other names should be on it. If none of those help, you can either break out Snooper to examine the client and server logs, or call Microsoft and open a case.
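The certificate check can be sketched in a few lines. This is only an illustration: it assumes you already have the certificate as a dictionary in the format Python's `ssl.SSLSocket.getpeercert()` returns, and the function name and host names are made up for the example.

```python
def extra_cert_names(cert, expected_fqdn):
    """Return names on the certificate other than expected_fqdn.

    `cert` is assumed to be a dict shaped like the result of
    ssl.SSLSocket.getpeercert(). An empty list means the only name
    present is the edge server's own FQDN.
    """
    names = set()
    # The subject is a tuple of RDNs, each a tuple of (key, value) pairs
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.add(value.lower())
    # subjectAltName is a tuple of (type, value) pairs, e.g. ("DNS", "host")
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS":
            names.add(value.lower())
    return sorted(names - {expected_fqdn.lower()})


# Hypothetical certificate data for illustration only
sample_cert = {
    "subject": ((("commonName", "edge-int.example.local"),),),
    "subjectAltName": (
        ("DNS", "edge-int.example.local"),
        ("DNS", "sip.example.local"),
    ),
}
print(extra_cert_names(sample_cert, "edge-int.example.local"))
```

Any name the function returns is an extra entry that would need to come off the internal edge certificate under the advice above.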

    Hope this helps!
    -kp
