the next door (so called) geek @rakkimk | your next door geek | friend | blogs mostly on technology, and gadgets.

Windows Azure Web Sites (WAWS) - Collecting dumps of the worker process (w3wp.exe) automatically whenever a request takes a long time

A slow website is perhaps the most common problem that website administrators and developers run into. If they are extremely unlucky, they see this problem only in their production environment. Many troubleshooting techniques and best practices are available for this scenario. I will try to cover them in a different post as part of my ASP.NET Troubleshooting series some other time. Meanwhile, you can try looking at this post of mine, where I’ve got something that might help you.

For now, let’s focus on Windows Azure Web Sites. As you know, this is a closed (well, not completely) hosting environment, but there are still a few things you can do for this problem. For example, you can collect FREB traces for a long-running request and see where it is stuck. FREB shows ASP.NET ETW events as well, but only the page lifecycle events. For example, it will tell you that the time is being spent in a stage like Page_Load, but not what inside Page_Load is slow. To find out more, you either have to profile your application, or collect a memory dump of the process serving the request and see what the request has been doing for such a long time.

I’ll walk through the steps to enable automatic collection of a memory dump whenever request processing exceeds ‘x’ seconds. This uses the same customAction for FREB that I’ve detailed in this old post of mine. In WAWS, the customActionsEnabled attribute for the website is set to “true” by default, so you just have to deploy the web.config file shown below. In this example, I’m going to use Windows Sysinternals procdump.exe to take the dump of our process (w3wp.exe). Here are the steps:

Enable ‘Failed Request Tracing’ from the Portal

First, you need to turn on FREB from your management portal. This article has brief steps on how to view those logs from Visual Studio, and even how to configure it from there. From the portal, for your website, under the Configure tab -> site diagnostics, set Failed Request Tracing to On, as shown below.

clip_image001

Transfer Procdump.exe to your deployment folder using FTP

Second, you need to put procdump.exe in your website deployment folder. Download it to your local machine from here. You can create a new folder and place it in there; that folder can also be the path where the dumps are stored. In my example, I’ve created a folder called ‘Diagnostics’ under the root, and placed procdump.exe in there. Screenshot of my FileZilla:

clip_image002

Configure the web.config with configuration to collect dump

Lastly, you need to place the below configuration in the web.config file to enable procdump.exe to be spawned with certain parameters whenever the request exceeds 15 seconds, in this case:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<system.webServer>
  <tracing>
  <traceFailedRequests>
    <remove path="*" />
    <add path="*" customActionExe="d:\home\Diagnostics\procdump.exe" customActionParams="-accepteula w3wp d:\home\Diagnostics\w3wp_PID_%1%_" customActionTriggerLimit="5">
     <traceAreas>
       <add provider="ASP" verbosity="Verbose" />
       <add provider="ASPNET" areas="Infrastructure,Module,Page,AppServices" verbosity="Verbose" />
       <add provider="ISAPI Extension" verbosity="Verbose" />
       <add provider="WWW Server" areas="Authentication,Security,Filter,StaticFile,CGI,Compression,Cache,RequestNotifications,Module,FastCGI" verbosity="Verbose" />
     </traceAreas>
     <failureDefinitions timeTaken="00:00:15" />
    </add>
  </traceFailedRequests>
  </tracing>
</system.webServer>
</configuration>

 

The above configuration will take a mini-dump of the w3wp.exe serving your WAWS site, and put it in the folder d:\home\Diagnostics, with the PID in the dump file name. If you want a full dump, add the -ma parameter. Example: customActionParams="-accepteula -ma w3wp d:\home\Diagnostics\w3wp_PID_%1%_".

You could use any of the other switches that you would typically use with ProcDump. For a slow-running page scenario, I might collect dumps at regular intervals, say 3 dumps 5 seconds apart, so that we can check what the request is doing across those points in time. For that, you can set customActionParams to “-accepteula -s 5 -n 3 w3wp d:\home\Diagnostics\w3wp_PID_%1%_”.
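
For reference, here is roughly how the rule from the web.config above would look with those interval switches in place. Treat it as a sketch that reuses the same paths and the 15-second threshold from my example: only customActionParams changes, and I have trimmed traceAreas down to a single provider for brevity, so adjust it for your own site.

```xml
<!-- Sketch: goes inside <traceFailedRequests>, replacing the <add path="*" ...> rule above -->
<!-- Asks ProcDump for 3 dumps, 5 seconds apart, when a request exceeds 15 seconds -->
<add path="*"
     customActionExe="d:\home\Diagnostics\procdump.exe"
     customActionParams="-accepteula -s 5 -n 3 w3wp d:\home\Diagnostics\w3wp_PID_%1%_"
     customActionTriggerLimit="5">
  <traceAreas>
    <!-- trimmed for brevity; keep the full provider list from the original rule if you need it -->
    <add provider="ASPNET" areas="Infrastructure,Module,Page,AppServices" verbosity="Verbose" />
  </traceAreas>
  <failureDefinitions timeTaken="00:00:15" />
</add>
```

Keep an eye on the disk space in the Diagnostics folder when collecting several dumps per slow request, especially if you combine this with -ma.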

Hope this helps!

Extending Glimpse for ASP.NET to include custom Timeline messages

23. August 2013 20:15 by the next door (so called) geek in ASP.NET, Glimpse

If you are debugging any ASP.NET MVC or WebForms application and you haven’t used Glimpse yet, you are missing out big time! It also has nice extensions for other frameworks like EF, nHibernate, etc. It collects server-side data and presents it as an overlay on the client side: information such as the configuration, environment details, request details, server variables, traces, and most importantly the timeline of events. I’ll let you explore more about Glimpse and its features on their website. Here is an example of how the timeline looks for a sample MVC4 application:

image

Adding your own messages to the timeline seems to be a feature that the Glimpse team is still working on. Until they release an easier way, here is what you can do to get your custom messages into the timeline:

  • Create a new class implementing Glimpse.Core.Message.ITimelineMessage (the namespace used in the sample below), implement its properties, and add a constructor that you will call from your code later to initialize a message to be added to the Glimpse timeline for the request.
  • Grab the Glimpse Runtime in the code.
  • Measure the time between your calls.
  • Publish the message via GlimpseRuntime.Configuration.MessageBroker.Publish<T>(message) method.

Here is the sample code for the above steps:

// grab the glimpse runtime
Glimpse.Core.Framework.GlimpseRuntime gr = (Glimpse.Core.Framework.GlimpseRuntime)HttpContext.Application.Get("__GlimpseRuntime");
 
// get hold of the timespan offset
TimeSpan t1 = new TimeSpan((DateTime.Now - gr.Configuration.TimerStrategy().RequestStart).Ticks);
 
// start a stopwatch
Stopwatch sw = new Stopwatch();
sw.Start();
 
// your function that you want to measure the time
System.Threading.Thread.Sleep(1000);
 
// stop the stopwatch, and get the elapsed timespan
sw.Stop();
TimeSpan t2 = sw.Elapsed;
 
// create the message with grouping text as appropriate
MyTimelineMessage msg1 = new MyTimelineMessage("WebService Calls", "first call to fetch x", "something here too", t2, t1, gr.Configuration.TimerStrategy().RequestStart, "red");
// publish the message to appear in the glimpse timeline
gr.Configuration.MessageBroker.Publish<MyTimelineMessage>(msg1);
 

Once this message is published, this will appear like below (depending on when you have called this in the request processing):

image

Here is my sample MyTimelineMessage.cs that you can use:

using Glimpse.Core.Message;
using System;
using System.Collections.Generic;
 
/// <summary>
/// Summary description for MyTimelineMessage
/// </summary>
public class MyTimelineMessage : ITimelineMessage
{
    public MyTimelineMessage(string itemName, string eventName, string eventSubText, TimeSpan duration, TimeSpan offset, DateTime startTime, string color)
    {
        // itemName is the grouping text for the timeline category
        EventCategory = new TimelineCategoryItem(itemName, color, "Yellow");
        EventName = eventName;
        EventSubText = eventSubText;
        Duration = duration;
        Offset = offset;
        StartTime = startTime;
        // Guid.NewGuid() gives each message a unique id; new Guid() is always all zeros
        Id = Guid.NewGuid();
    }

    // category (grouping text and colors) under which the event is shown
    public TimelineCategoryItem EventCategory { get; set; }

    // main text and sub text of the event
    public string EventName { get; set; }
    public string EventSubText { get; set; }

    // how long the measured call took, and when it started relative to the request start
    public TimeSpan Duration { get; set; }
    public TimeSpan Offset { get; set; }

    public DateTime StartTime { get; set; }

    public Guid Id { get; set; }
}

Hope this helps!

Debugging the (client) ASP.NET SignalR application that someone else wrote

Debugging code that you haven’t written is an art! We always look for some help from the application developer, who could perhaps tell us what’s going on in the application when we perform certain actions. With rich client applications becoming mainstream, it’s very important that libraries and client-side frameworks emit additional debug statements to the browser, so that anyone who is trying to debug the application gets a little help.

ASP.NET SignalR is one popular library for ASP.NET developers to build rich, real-time web applications. SignalR’s client-side library emits useful debug statements for each event that occurs, which is very helpful if you are trying to debug your application. For example, if you have logging enabled for your hub, you would see the below events during connection establishment. I’m using the sample application that’s built in this tutorial for this blog.

image

Yeah, that’s the awesome-looking F12 Developer Tools in IE11 Preview. Try it out, and read more about its new features here. I’ll perhaps talk about that in a different post later. Let’s come back to our issue at hand. If the developer of this application is you, or someone who is kind enough to turn on logging, this is already taken care of. But most of the time, when you are debugging someone else’s production application (like I do for a living), or just poking around out of curiosity, you will look for options to ease your job. Here in this case, if I can turn on the logging for the hub, I’m golden. Let’s see how to do this.

Logging for the SignalR hub, if you are using the automatic proxy generated, can be enabled by the below code:

$.connection.hub.logging = true;

Again, in our case, we do not have access to the code. Somehow, you need to enable logging in the browser for the application instance loaded inside it. Yeah, you guessed it right – the console. Most browsers include developer tools with a ‘Console’. It is mostly used to emit debug information to understand what’s going on, and most of the time developers turn the debug statements on/off via a switch, just like the ASP.NET SignalR client-side library does. Follow the steps below to enable logging like above, even if the code doesn’t have it enabled:

    1. Browse to your ASP.NET SignalR application.
    2. Open F12 Developer Tools, and go to the ‘Console’
    3. Type the same code as above to enable the logging - $.connection.hub.logging = true; in the console window, and hit return.
    4. You would now start seeing all the debug messages that SignalR framework emits.

image

In this case, though, we enabled logging after the connection was already established. Most of the time, we need to debug issues with the connectivity itself. In those cases, start debugging your page with the IE11 F12 Developer Tools (if you have noticed, in the latest version you do not need to reload the page to start debugging!). Set a breakpoint in the $.connection.hub.start() method (inside jquery.signalR-1.1.3.js), and refresh the page. Once the page is refreshed and the breakpoint is hit, you can execute the same code in the console as above, so that the logging option is set. The debug statements that the client-side library emits are pretty verbose. For example, when I turned off WebSockets in my IE11, I saw the below during connection establishment, and you can see it switching to long polling mode.

image

Hope this helps someone who wants to understand what’s going on at the client side of someone else’s ASP.NET SignalR application.

Defining Common issues in ASP.NET applications - Crash.

This blog is a continuation of my previous one in the Troubleshooting series that I’m planning to continue writing. If you haven’t read that one, please read it before beginning this one. In the last blog, I spoke about a few questions, and my first step in any troubleshooting – isolation, which helps me narrow down my search and gives me the opportunity to do concrete troubleshooting rather than trying out a couple of random things like rebooting. Yeah, that does fix a lot of problems, but will you understand why the problem really happened after doing a reboot? In a lot of instances, possibly not. If your application is super smart about logging, or the problem can easily be identified in the default logs like the event logs, IIS logs, and HTTPERR logs, you are golden. Otherwise, you might be pushed by your boss to do an RCA without having sufficient data.

Defining Crash

In this blog, I’m going to talk about one of the few common issues, and what you possibly need to do when you face it. Again, this is not going to be a complete list of things that you can do, but it will at least give you a head start. You know what a crash means, but will you be able to look at an end-user scenario and right away classify it as a crash? No, not all the time. You know that your code needs to run inside a process in Windows. I say the process has “crashed” if it has exited for unknown reasons, i.e. terminated unexpectedly. If you are running your ASP.NET websites on IIS7+ servers, it is the w3wp.exe process that runs your code.

Understanding Crash, and its symptoms

What are the possible symptoms in your application that let you term a problem as a crash, and what can you possibly see in the existing logs? I’ll also try talking about a few common tools that will help you in troubleshooting the crash.

First of all, let’s talk about a few common errors and issues that are noticed by the end user:

  • Session Loss – This is the most common symptom for people who store the ASP.NET Session InProc. When the ASP.NET Session mode is configured as InProc, your session variables (and values) are stored inside the w3wp.exe process corresponding to the Application Pool that’s configured to run your application. One of my colleagues has written briefly about a few questions to ask, and possible reasons for this, in this article. Please read it; it is a valuable ‘session loss’ troubleshooting guide. People often report ‘Session Loss’ when their application tells them that they are logged out of the system and asks them to log in again. A few application developers do custom logging: they store some session variable when your session starts, check for that value in a few pages for custom logic, and show you the “logout” message if that variable doesn’t exist, forcing you to log in again, which initializes the session variables again.
  • Webpage keeps spinning for a long time, and then gives you the result – This is again a very common scenario, but the end user tends to report it as slow performance instead. If the process exits, it takes down everything that you have initialized, be it session variables, the ASP.NET cache, or anything else. The most common logic is to re-initialize such data if it is not available, so you spend time re-initializing it: getting it from the database, reading it from the file system, and so forth.
  • You see a ‘Service Unavailable’ error in the page – This is also very common, and happens when the process serving the Application Pool has terminated unexpectedly x number of times in y seconds. The default configuration in IIS is that if your process crashes 5 times within 5 minutes, the Application Pool is disabled (stopped). The administrator has to manually start the Application Pool to get the site back again. You can always change this option, but I’d say better leave it at the default, so that it at least prompts you to fix the problem. If you disable this option, your crashes will go unnoticed unless you pay attention to the user experience and the event logs. You can configure this under the ‘Advanced Settings’ of an Application Pool, under Rapid-Fail Protection. You can also configure other options there, like a custom executable to run if the AppPool gets disabled by this ‘Rapid-Fail Protection’ feature of IIS.
  • Other custom error messages which your application might throw in case if the initialized data becomes unavailable from the process memory.
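
If you prefer looking at configuration rather than the UI, the Rapid-Fail Protection settings live on the application pool’s failure element in applicationHost.config. The sketch below spells out the defaults described above (5 crashes within 5 minutes). The autoShutdownExe path and its parameter are placeholders I made up for illustration, not something from a real server.

```xml
<!-- applicationHost.config, under <system.applicationHost> -->
<applicationPools>
  <add name="DefaultAppPool">
    <!-- Disable the pool after 5 worker process crashes within 5 minutes (the IIS defaults) -->
    <!-- autoShutdownExe/autoShutdownParams: hypothetical script to run when the pool is disabled -->
    <failure rapidFailProtection="true"
             rapidFailProtectionInterval="00:05:00"
             rapidFailProtectionMaxCrashes="5"
             autoShutdownExe="C:\Tools\notify-admin.cmd"
             autoShutdownParams="DefaultAppPool" />
  </add>
</applicationPools>
```

Setting rapidFailProtection to "false" would keep the pool running through repeated crashes, which is exactly the case where crashes can go unnoticed.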

 

Event logs that get generated for a crash of the Application Pool

Here are a few event descriptions that you would see if a crash occurs:

Log Name:      System
Source:           Microsoft-Windows-WAS
Date:               [time stamp]
Event ID:          5011

Description:
A process serving application pool '[app pool name]' suffered a fatal communication error with the Windows Process Activation Service. The process id was '[PID]'. The data field contains the error number. 

 

Description:

A process serving application pool 'DefaultAppPool' terminated unexpectedly. The process id was '[PID]'. The process exit code was 'exit code'.

 

Application pool '%1' is being automatically disabled due to a series of failures in the process(es) serving that application pool.

 

Event ID : 1000
Raw Event ID : 1000
Record Nr. : 15
Category : None
Source : .NET Runtime 2.0
Error Reporting Type : Error
Message : Faulting application w3wp.exe, version 6.0.3790.1830, stamp 42435be1, faulting module mscorwks.dll, version 2.0.50727.42, …

Now you have seen how to define a crash, its possible symptoms, and the event logs. But what next? You need to find the cause of the problem, right? You should first look for clues in the event logs to see if anything was logged by IIS/ASP.NET components at the time of the issue, other than the few above. You might even be lucky enough to spot a 3rd-party module that’s causing the crash, but not all the time. What can possibly help you is a memory dump of the process, captured just before the process dies. There are a few tools that can help you collect the memory dump of the crashing process, and a few that can analyze the dumps for you, to the extent of showing you the crashing stack. For the collection, you also have a built-in option that saves these memory dumps, called Windows Error Reporting. You can read the below blogs, which show the steps to collect the dumps for this scenario.

Using Windows Error Reporting

Using WER: Collecting User-Mode Dumps
http://msdn.microsoft.com/en-us/library/bb787181(VS.85).aspx

 

How To: Collect a Crash dump of an IIS worker process on IIS 7.0 (and above)
http://blogs.msdn.com/b/webtopics/archive/2009/11/25/how-to-collect-a-crash-dump-of-an-iis-worker-process-w3wp-exe-on-iis-7-0-and-above.aspx?wa=wsignin1.0

 

How to use ADPlus to troubleshoot "hangs" and "crashes"

http://support.microsoft.com/kb/286350

 

Using DebugDiag Tool

How to Use the Debug Diagnostic Tool v1.1 (DebugDiag) to Debug User Mode Processes

http://msdn.microsoft.com/en-us/library/ff420662.aspx

 

Other tools that can help you are ADPlus (which comes with the Debugging Tools for Windows; see the article above), and ProcDump from Microsoft TechNet Sysinternals (with the -t option). The DebugDiag tool comes with a powerful analyzer as well: you can just double-click the dump file that was collected, and it will create a beautiful report that includes the crashing call stack and a possible explanation/next steps for the issue. If you are interested in debugging the collected dumps yourself, the links below could be super helpful! Tess is known for her brief blogs on dump analysis, and is a great person to interact with!

.NET Debugging Demos Lab 2: Crash

http://blogs.msdn.com/b/tess/archive/2008/02/08/net-debugging-demos-lab-2-crash.aspx

 

Hanselminutes on 9 - Debugging Crash Dumps with Tess Ferrandez and VS2010

http://channel9.msdn.com/Shows/HanselminutesOn9/Hanselminutes-on-9-Debugging-Crash-Dumps-with-Tess-Ferrandez-and-VS2010

 

I’ll follow up with more posts on general troubleshooting.

Troubleshooting ASP.NET Applications running in IIS7 and above

If you do not know me, I work as an Escalation Engineer in the Microsoft IIS/ASP.NET Support team in Bangalore, primarily debugging our customers’ ASP.NET applications. That’s my day job – to debug others’ code, and any Microsoft component that’s involved. I’m planning to write a series of posts on general troubleshooting, and the steps I typically use to diagnose a customer problem. You could use the same steps, and it is definitely not rocket science. If I can do this, you can too!

In this first post, I thought I would not talk about anything technical, but rather about a few things that you might want to do before you start the real troubleshooting. Let’s take the scenario of a ‘slow running’ ASP.NET website, and see how we approach this problem step by step. If you have worked with Microsoft Support before, you know we generally ask you ‘many’ questions. All of those questions are asked to understand the problem better. For example, for this slow request problem, here are the typical questions that you would need to ask yourself when troubleshooting. A few of these questions apply to ‘any’ problem that you troubleshoot.

  • The issue is slowness, but how slow is it? First, you need to understand how much of a delay you are talking about, so that you can think of a few tools to troubleshoot this quickly.
  • It is slow, but how fast should it be? What is the expected time the page should respond in? If you do not have a benchmark, then you are shooting in the dark. In reality, this number would be a result of your testing. You would know how long the page typically takes, depending on its operation. Of course, if it is running a lengthy database operation, this number itself would be larger. It is essential that you have a comparison.
  • Environment of the server. Next, you need to know where the problem is occurring. If you have multiple servers, on which of them does this slow performance problem occur? It is possible that the servers where the problem occurs are really slow ones with old hardware. Understanding the details of the environment is very important.
  • Environment of the client. You own the server, but the problem is perhaps reported by your end users. You should try to learn about the environment of those end users as well. If it is happening for many users, you might want to understand everything from their OS and browser versions to their network topology.
  • When did the problem start? Yes, this would be the most interesting question of all, but it is the one that might not get a ‘right’ answer most of the time. One of the many reasons could be that too many changes were made. These range from deploying the application on a new server, installing a new service pack for the OS, or upgrading the application, to adding new users to the application. A clear understanding of this helps diagnose the problem better.
  • What exactly is slow? This is another tough question to answer most of the time, since there might be many pages that are slow, and you may not know them all. But it is essential that you list the name of the page, and the operations you do on the page that give you the problem. For example, a button click on the login page is slow to respond.

 

Again, these are a few important questions to ask, not the only questions. If I get a chance to talk to you while diagnosing your problem, I’ll perhaps ask 100 more questions – definitely related to the problem :) Okay, what next? You get the answers to these questions, what’s perhaps your next step?

My first step in troubleshooting is always ‘Isolation’. Try to narrow down your search. First, split your main problem into pieces to troubleshoot. For example, one button click might do 10 different activities; try isolating which of those 10 has the problem, so that you can concentrate only on the particular activity that is slow. The isolation step also includes checking whether the problem is isolated to only a few users, or affects all the users. If it happens only for, say, 2 of your users from their workstations, you can probably avoid concentrating on the server; perhaps they have a slow network. This is just an example. Your problem could well be on the server even when just 2 users face it, for example because of custom code for them, or because the query that gets generated for them is different.

Once you isolate the problem, the very important next step is to make sure you aren’t troubleshooting something that is already resolved in, say, the latest release of your website. Do not try to reinvent the wheel. Do not waste your CPU cycles (!) working on an issue that someone has already fixed. Search. Search support.microsoft.com, search Stack Overflow, search Bing, and most importantly, search your internal database, if you have one, for issues that are already fixed. If you are facing a critical problem, make sure you search enough before trying to dig deeper. Again, this step doesn’t apply to problems that are isolated to just your application, which is the case most of the time, like the slowness we were talking about. It applies more to issues like exceptions, runtime errors, etc.

Only after you have a clear understanding of the problem and the environment it is isolated to, and have made sure the issue is not a known one, should you proceed further. I could in fact write more in this post, but I’ll reserve the rest, like general troubleshooting techniques, for my future posts. If you are curious, here is what I’m planning to write about next. I’m sure I’ll add more to this list, and will perhaps update this post when I do.

  • Defining Common issues in ASP.NET applications – Slow Perf, Hangs, Crashes, High Memory, etc.
  • Built-in tools that help you troubleshoot a few of these issues.
  • How much can existing logs like IIS logs, HTTPERR logs, and Event logs tell you?
  • Scenario #1 : Troubleshooting a slow performance problem using various tools.
  • More Scenarios, more tools, whenever I find time to write.

 

Follow along if you are interested.