Programming and Intuition, Part II

You may have read the first part of Programming And Intuition. I promised to share some more experience on that topic, and it didn't take long before I got into a situation again where intuition and experience helped me a lot.

This time at a different customer. In the Excel ecosystem it is certainly pretty easy to end up with different environments. The only more or less stable part up to now is the operating system, mostly Windows Server 2008 R2. But even there, several editions are available: Enterprise, Data Center.

This customer used the Data Center edition and Excel 2010. At first everything looked just fine. When they started to build up the reporting, in this case with XlCubed, they experienced timeouts when processing long-running jobs.

Luckily we have a very detailed log. This is one of the major must-haves. Without the log we would kill ourselves… and certainly not be able to solve issues; even the easy ones would be hard.

What does that mean, timeouts in Excel?

Since it is always possible that Excel hangs, becomes unresponsive, or shows a message box, there is a mechanism in place that ensures a non-responsive Excel process is killed so it does not block the overall processing. In this case, that mechanism killed a responsive Excel process.

Of course the mechanism doesn't know whether Excel is really still working. Excel may be busy and simply never come back, and that is pretty hard to detect.
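Such a watchdog can be sketched roughly like this (purely illustrative; the source doesn't show its actual implementation, and the `ExcelWatchdog` class name and time-budget approach are my own assumptions):

```csharp
using System;
using System.Diagnostics;

class ExcelWatchdog
{
    // Kill the Excel process once the current job exceeds its time budget.
    // This illustrates exactly the blind spot described above: a timeout
    // cannot distinguish "hung" from "still working, just slow".
    public static void EnforceTimeout(Process excelProcess, TimeSpan budget, DateTime jobStartedUtc)
    {
        if (DateTime.UtcNow - jobStartedUtc > budget && !excelProcess.HasExited)
        {
            excelProcess.Kill();
        }
    }
}
```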

What happened? The reporting at the moment is pretty simple: a single report that is looped over a number of regions to get data for each of them and put it into slides. Watching the log it was clear that the same processing takes longer the longer Excel has been processing workbooks. A small table from our test lab:

ProcessOrder Duration
100 00:06:37.000
99 00:06:29.000
4 00:00:59.000
3 00:00:57.000
2 00:00:53.000
1 00:00:55.000

We start processing; the complete run of one Excel report with multiple slides takes 55 seconds. The report is looped over 100 regions. The last iteration needs 06:37, more than six times longer. And of course the code interacting with Excel is identical every time.

What's happening? The log tells us that mainly four actions get slower:

1. Open Workbook
2. Close Workbook
3. Export pictures
4. Disable Alerts

What can we do about it?

This is the code of the class handling case 4, Disable Alerts:

[code language="csharp"]
var application = context.Application;
application.Object.AskToUpdateLinks = false;
application.Object.AlertBeforeOverwriting = false;
application.Object.DisplayAlerts = false;
application.Object.EnableEvents = true;
application.Object.FeatureInstall = MsoFeatureInstall.msoFeatureInstallNone;
[/code]

There is really nothing special about it. The log tells how long this processing takes: it ranges from 300 ms up to 10 minutes. No, I am not kidding. This is Excel in real life.

Staring at these lines of code can be frustrating. My eyes immediately got stuck on FeatureInstall. This is the process of intuition that I mean; I didn't consider the rest at first. The second candidate was EnableEvents. It really does something (at least when you turn it off).

I changed this class to only set a property when it currently has a different value. Why didn't I do that from the beginning? Good question. Because it is ugly. Because it is Interop, and defensive programming is really necessary, but not at every price. "Normally" this doesn't lead to any problems when running under Excel 2007 or Excel 2013. But Excel 2010…
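The tweak can be sketched like this (a minimal sketch, not the original class; the `application` wrapper is taken from the snippet above):

```csharp
// Sketch: only touch an Interop property when its value actually differs.
// Every setter call is a cross-process COM round trip, and under Excel 2010
// even redundant writes got slower the longer the process ran.
if (application.Object.DisplayAlerts != false)
    application.Object.DisplayAlerts = false;

if (application.Object.EnableEvents != true)
    application.Object.EnableEvents = true;

if (application.Object.FeatureInstall != MsoFeatureInstall.msoFeatureInstallNone)
    application.Object.FeatureInstall = MsoFeatureInstall.msoFeatureInstallNone;
```

Each guard is still a COM read, but in this case the reads turned out to be far cheaper than the redundant writes.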

After that small tweak, the processing speed improved considerably. There were two other things I did:

  • Looking at all the logs, I couldn't identify a real problem in my own code. Only the interaction with Excel, or Excel actions like opening a workbook, calculation and the like, consumed more time the longer Excel ran. So a very obvious thing to do is to tell the operating system that the process in question shall be handled with higher priority.
  • What about the Interop objects that are no longer in use? Sure, you never call the Garbage Collector yourself in any .NET language… unless there is no other possibility. Five more places in the code needed to call the Garbage Collector directly to ensure no Interop objects are kept alive that are no longer in use.
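Raising the priority of the Excel process can look like this (a sketch; whether AboveNormal is the right level depends on what else runs on the server):

```csharp
using System;
using System.Diagnostics;

class PriorityDemo
{
    static void Main()
    {
        // Ask Windows to schedule the current process ahead of Normal-priority
        // work. High or RealTime would risk starving other server processes.
        var process = Process.GetCurrentProcess();
        process.PriorityClass = ProcessPriorityClass.AboveNormal;
        Console.WriteLine(process.PriorityClass);
    }
}
```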
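The Garbage Collector call for Interop cleanup usually follows a well-known pattern (a sketch of that common pattern; the source doesn't show its exact code):

```csharp
using System;

static class InteropCleanup
{
    // Release COM references that managed code no longer holds on to.
    // The double Collect/WaitForPendingFinalizers pass is the usual pattern:
    // the first pass runs the finalizers of the Runtime Callable Wrappers,
    // the second collects whatever those finalizers released.
    public static void CollectComObjects()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```

Calling something like this after closing a workbook makes sure the wrappers around Excel objects are finalized, so Excel's own reference counts can drop.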

The result was considerably better. The processing still takes longer the longer Excel runs, but coming back to the table above, the last iteration finished in about 2:30 instead of 6:37.

Excel is a beast, and it can be killed by gut feeling 😉