Sunday, October 16, 2011

Adapting Snoop WPF to multi-AppDomain applications. Snooping Excel add-ins.

First of all, a note for those who are going to skip all the lyrics and just try the app.

Please keep in mind that it will snoop your AppDomain-bound sources only if they run under the highest .NET version among all the AppDomains of the process being snooped.

If several domains of the process run under the .NET 4 runtime, and your target app domain (add-in) runs under .NET 3.5 (i.e. the .NET 2.0 runtime), you won't be able to snoop it, for the reasons given below.

You can still adjust the logic (or hardcode one path) to snoop it anyway. The details follow in this post.

I haven't adjusted any code for zooming. You can do it yourself; the provided version should just snoop.
As a side effect you'll also see snooping windows for SnoopUI itself (because they also reside in some app domains). Please just disregard them; fighting irrelevant windows was not my goal.

The sources are available for download at the end of the post.

Hello world.

I should have written this post about three weeks ago, when most of the ideas had already been dug out, but the 9-to-5 working day and lots of off-work duties (accompanied by busy weekends) prevented me from doing so. Looking back, I'm happy it happened this way, because I had time to think things over and improve a few of them. I want to thank Maciek Rakowski (from Snoop WPF development) for his efforts and investigation, and my colleagues from EPAM Systems, Maksim Volkau and Dzmitry Lahoda, for finding some issues in the results of my iterative approaches. I must say, sometimes I didn't believe I would make this work, and several times I decided to give it up completely, not to mention how many times I was beating my head against the wall (figuratively, of course :)).

I can't help mentioning that for any WPF developer who deals with XAML (I think the majority of WPF developers do), tools like Snoop WPF should be used almost as often as Visual Studio. I'm not insisting on this particular one; competitors like WPF Inspector might be even better in some scenarios. Because of their .NET nature, these tools are more than simple visual tree visualizers (like UISpy, which was recently converted into Inspect.exe). E.g. Snoop WPF lets you navigate deeper into an element's properties (like Parent, Child, TemplatedParent, or any other) by right-clicking and choosing the “Delve” menu item (there is a button in the UI to go back to the place you delved from). It also highlights recently changed properties with a different color, hides default values of an element's properties, and lets you modify property values at run time, which amounts to a bit of debugging. WPF Inspector provides trigger debugging, etc. These tools not only help you understand your visual tree better, but really save a lot of time. Okay, that's all about singing the praises.

About three months ago I unexpectedly discovered that Snoop couldn't snoop my Excel add-ins, complaining that it couldn't find the Root Visual. Of course, I immediately googled it (several people had tried Snoop in the same environment with no luck, and the developers of Snoop had no accurate clue what might be wrong). Then I downloaded Snoop's sources from CodePlex and tried to debug (debugging Snoop is a separate topic; it's mainly not about attaching to the Snoop.exe process, but about attaching to the process of the application you are going to snoop; the details follow later). The basics of how Snoop finds the Root Visual are as follows: if Application.Current != null, then Application.Current is the root; otherwise it takes the RootVisual of the first PresentationSource in the collection PresentationSource.CurrentSources. Of course, for Excel add-ins Application.Current is always null (unless you explicitly create an Application). You know what was strange? The collection PresentationSource.CurrentSources was always empty. Back then I couldn't figure out the reason behind it. Fortunately, we had a working TestHarness (a regular WPF application with the contents of our add-in), and I successfully used it with Snoop. And gave up.
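The lookup just described can be sketched roughly like this. It is a simplified illustration rather than Snoop's actual code: the class and method names are made up, and only Application.Current and PresentationSource.CurrentSources come from the description above.

```csharp
using System.Windows;

public static class RootFinder
{
    // Returns the object to snoop, or null when there is nothing to show.
    public static object FindRoot()
    {
        // A regular WPF application: the Application object itself is the root.
        if (Application.Current != null)
            return Application.Current;

        // Hosted WPF content (e.g. an add-in): fall back to the root visual
        // of the first presentation source that WPF reports.
        foreach (PresentationSource source in PresentationSource.CurrentSources)
            return source.RootVisual;

        // Nothing found: this is the "can't find Root Visual" situation.
        return null;
    }
}
```

With Application.Current == null and an empty CurrentSources collection, which is exactly what I was observing in Excel, FindRoot() returns null.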

About three weeks ago I faced one more problem: disappearing ContextMenu styles. I blogged about it recently. This was where I needed Snoop very much. I knew it wouldn't work, but I downloaded the latest version anyway, and indeed it couldn't find my Root Visual again. I solved my problem without Snoop WPF, but spent far more time than I wanted to.

That was the last straw, and it pushed me to do the research. Before starting, I created an item in Snoop WPF's Issue Tracker. You can check it here. It is always better if the original developers fix something, because they know their codebase much better than we do. Luckily, Snoop WPF's sources are written in a readable manner, so it won't take you much time to get into the basics of what's going on there. By the time I got the first meaningful answer (from Maciek Rakowski on Oct 3, 2011), I already had a working solution, and mentioned it in my answer on Oct 4, 2011. But I still didn't rule out that the guys might do it better than I did. Hence, for a week or two I was receiving several solutions from Maciek and checking them. Unfortunately, I couldn't share the source of our add-in with Maciek, because it is proprietary software (we are not licensed to open the code).

How is Snoop WPF actually snooping?

Of course, you have to understand that to access so many properties of the elements in the process memory, we have to be “inside of” the process, hence we need to inject ourselves there somehow. The development team of Snoop WPF came up with an ingenious solution to this problem. To understand it, one should have some basic knowledge of native (Win32) programming. They defined a custom window message (WM_GOBABYGO) to be sent to the process's window. And they install a WH_CALLWNDPROC-type hook (thread-specific, on the app's main window thread) to handle this message. The main trick happens here: when the hook procedure is triggered, we are already “inside of” the process we want to inject into. From here you can do things inside the process, e.g. load the Snoop.exe assembly.

But how (being inside a foreign process) would we know which assembly to load and which method to execute to snoop? Sending a message in Win32 allows you to pass two parameters with the message (they are pointer-sized: 4 or 8 bytes each, depending on the bitness of the process). Of course, you can't store long strings or structures in such small pieces of memory. That's right, they are meant for passing addresses. So before actually sending a message we should write some information into the target process's memory and pass the address of this information as a parameter together with WM_GOBABYGO. All this logic is implemented in a .NET assembly, ManagedInjector***.dll. The assembly is written in C++/CLI and involves some native code, which is often inconvenient to write in C#.

We're heavily dependent on the .NET runtime version of the application we're snooping. Hence, for .NET 3.5 (the .NET 2.0 runtime) and for .NET 4.0 we need separate assemblies, and they must differ for the x86 and x64 platforms too. So, we have ManagedInjector32-3.5.dll, ManagedInjector64-3.5.dll, ManagedInjector32-4.0.dll, ManagedInjector64-4.0.dll. To run the proper DLL, Snoop WPF has a small console launcher program, which must target the proper version of the .NET runtime. So, there are also four versions of it, named respectively: ManagedInjectorLauncher32-3.5.exe, ManagedInjectorLauncher64-3.5.exe, ManagedInjectorLauncher32-4.0.exe, ManagedInjectorLauncher64-4.0.exe. The launcher accepts the assembly path, class name and method name as its command-line parameters, and passes them on to the Inject() method in ManagedInjector***.dll.

So, Snoop chooses which ManagedInjectorLauncher***.exe it should launch for the particular app. This logic is contained in Injector.cs. You can see there that it checks the versions of the .NET libraries the application references and takes only the highest one into account.

So, if your application or AppDomain runs under a lower version of .NET, look at the Launch() method in Injector.cs:

var file = Path.Combine(directory, "ManagedInjectorLauncher" + Suffix(windowHandle) + ".exe");

Adjust the file variable, either at run time in the debugger or by hardcoding a value in place of Suffix().

NOTE: this should be done with the debugger attached to Snoop.exe directly, because at this step we're not yet inside the target application's process.
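If you go the hardcoding route, the replacement for Suffix() could look like the sketch below. LauncherPath and its parameters are hypothetical names; only the file-name pattern is taken from the four launcher names listed above.

```csharp
using System;
using System.IO;

public static class LauncherPath
{
    // bitness: "32" or "64"; runtime: "3.5" or "4.0" (the four launchers Snoop ships).
    public static string Build(string directory, string bitness, string runtime)
    {
        if (bitness != "32" && bitness != "64")
            throw new ArgumentException("Expected \"32\" or \"64\".", "bitness");
        if (runtime != "3.5" && runtime != "4.0")
            throw new ArgumentException("Expected \"3.5\" or \"4.0\".", "runtime");

        // Same shape as the original line, with an explicit suffix instead of Suffix(windowHandle).
        return Path.Combine(directory, "ManagedInjectorLauncher" + bitness + "-" + runtime + ".exe");
    }
}
```

For a 32-bit Excel with a .NET 3.5 add-in that would be LauncherPath.Build(directory, "32", "3.5").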

To sum up:

1) Call the Inject() method from the ManagedInjector***.dll that targets the proper runtime version.

2) From that method, write our strings (assembly path, class name, method name) into the target process's memory via VirtualAllocEx + WriteProcessMemory from the Windows API.

3) Install a WH_CALLWNDPROC-type hook specific to the thread of the process's main window.

4) Send WM_GOBABYGO with the address of our strings as a parameter.

5) Inside the hook procedure, when WM_GOBABYGO arrives, read the strings from the process memory. Load the proper Snoop.exe assembly and execute the method (via reflection).

6) Uninstall the Windows hook and free the memory we allocated for the strings.
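Steps 2 and 6 can be sketched in C# with P/Invoke, just to show the shape of the calls. Snoop does this in C++/CLI inside ManagedInjector; the class name, the '$' separator and the UTF-16 encoding below are assumptions made for illustration, and installing the hook and sending the message (steps 3 and 4) are left out.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

public static class RemoteStrings
{
    const uint PROCESS_VM_OPERATION = 0x0008, PROCESS_VM_WRITE = 0x0020;
    const uint MEM_COMMIT = 0x1000, MEM_RESERVE = 0x2000, MEM_RELEASE = 0x8000;
    const uint PAGE_READWRITE = 0x04;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr OpenProcess(uint access, bool inheritHandle, int processId);
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAllocEx(IntPtr process, IntPtr address, UIntPtr size, uint allocType, uint protect);
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool WriteProcessMemory(IntPtr process, IntPtr address, byte[] buffer, UIntPtr size, out UIntPtr written);
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualFreeEx(IntPtr process, IntPtr address, UIntPtr size, uint freeType);
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr handle);

    // Packs the three strings into one null-terminated UTF-16 buffer.
    // Assumption: none of the strings contains the '$' separator.
    public static byte[] Pack(string assemblyPath, string className, string methodName)
    {
        return Encoding.Unicode.GetBytes(assemblyPath + "$" + className + "$" + methodName + "\0");
    }

    // Reverse of Pack: roughly what the code inside the hook procedure has to do (step 5).
    public static string[] Unpack(byte[] payload)
    {
        return Encoding.Unicode.GetString(payload).TrimEnd('\0').Split('$');
    }

    // Step 2: copy the payload into the target process and return its remote address.
    public static IntPtr Write(int processId, byte[] payload, out IntPtr process)
    {
        process = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE, false, processId);
        if (process == IntPtr.Zero) throw new Win32Exception();

        UIntPtr size = (UIntPtr)(uint)payload.Length;
        IntPtr remote = VirtualAllocEx(process, IntPtr.Zero, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (remote == IntPtr.Zero)
        {
            int error = Marshal.GetLastWin32Error();
            CloseHandle(process);
            throw new Win32Exception(error);
        }

        UIntPtr written;
        if (!WriteProcessMemory(process, remote, payload, size, out written))
        {
            int error = Marshal.GetLastWin32Error();
            Free(process, remote);
            throw new Win32Exception(error);
        }
        return remote; // this address travels as a parameter of WM_GOBABYGO (step 4)
    }

    // Step 6: release the remote buffer (size must be zero with MEM_RELEASE) and the handle.
    public static void Free(IntPtr process, IntPtr remote)
    {
        VirtualFreeEx(process, remote, UIntPtr.Zero, MEM_RELEASE);
        CloseHandle(process);
    }
}
```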

How did I come to AppDomains in Excel add-ins? What are AppDomains? How do you work with them?

Well, while examining this notorious PresentationSource.CurrentSources, I kept wondering how my presentation source could fail to get into this collection. I researched the source code of the Microsoft libraries a bit, and it appeared that every created PresentationSource (or a class derived from it, e.g. HwndSource) definitely ends up in this collection. This confused me even more. I decided that something was wrong with the CustomTaskPane that hosts all our stuff. Hence, to reach the custom task pane's content, I wanted to examine the Application.CommandBars collection to get my actual task pane, and from there grab the Windows Forms ElementHost, where we host all our WPF stuff. But for this purpose I had to retrieve the Excel.Application object by its hWnd, etc. A crazy idea, isn't it? Fortunately, I gave it up.

I must say we are developing shim add-ins, not VSTO add-ins. So I examined the code written by some of my colleagues some time ago and rediscovered that they create a separate AppDomain for each of the add-ins. I was quite new to AppDomains, but I knew that they represent isolated “spaces” inside my running process, and vice versa: several applications can exist inside one AppDomain. This led me to an assumption: the collection PresentationSource.CurrentSources is empty because it might be a separate collection per AppDomain, not per application…

Luckily the assumption turned out to be true. But it had already taken me 1.5 days to arrive at this very basic but important idea.

At first, I thought I could easily take an object from another domain, examine it and so on, and the task would be over. Unfortunately, I forgot about the whole concept of isolation, which is fundamental to AppDomains. Even in the Visual Studio debugger, objects from other app domains are represented (at best) as instances of some TransparentProxy class… So you can't simply play with something from another AppDomain. Believe it or not, I couldn't even enumerate the loaded assemblies of another app domain unless I was “inside of” it. The details will follow.

First of all, how can you see what's happening inside all the domains of your application? I mean, at least see which assemblies are loaded in each of the domains. For this purpose one may use WinDbg. It can give you a good dump of the domains of the application you attach it to. Again, please be very careful with the bitness of your WinDbg. By default, the Windows SDK installs the x64 version of WinDbg on an x64 OS. Please refer to this page to read how to install a 32-bit version. E.g. I have a 32-bit Excel, so the x64 version was good for nothing in some scenarios. And vice versa: for x64 applications use only the x64 version.

After installing and running WinDbg, attach it to your target process, then go to the command window and run:

.loadby sos clr

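This command should load the SOS Debugging Extension for the .NET 4 runtime (for a process running the .NET 2.0 runtime, the equivalent is .loadby sos mscorwks). If you used the proper WinDbg version, no error messages will be shown. Then just type the next command: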

!DumpDomain

You should now see the list of domains with the assemblies loaded in each of them.

Next, a word about communication between AppDomains. Specialists know it better, but I will name at least three mechanisms:

1) The AppDomain.DoCallBack() method.

2) A class derived from MarshalByRefObject.

3) Serializable objects, which are serialized and deserialized automatically when passed across domains.
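To make the three mechanisms concrete, here is a small self-contained sketch for the .NET Framework (AppDomain.CreateDomain is not supported on .NET Core and later). All type and method names in it are invented for the illustration.

```csharp
using System;

// 3) Serializable: instances are copied by value across the domain boundary.
[Serializable]
public class DomainInfo
{
    public string Name;
    public int AssemblyCount;
}

// 2) MarshalByRefObject: the instance stays in its own domain, callers only get a proxy.
public class DomainProbe : MarshalByRefObject
{
    public DomainInfo Describe()
    {
        // Executes inside the domain where the probe was created.
        return new DomainInfo
        {
            Name = AppDomain.CurrentDomain.FriendlyName,
            AssemblyCount = AppDomain.CurrentDomain.GetAssemblies().Length
        };
    }
}

public static class CrossDomainDemo
{
    // 1) DoCallBack: a static method, so no captured state has to be marshalled.
    static void SayHello()
    {
        Console.WriteLine("Hello from " + AppDomain.CurrentDomain.FriendlyName);
    }

    public static DomainInfo Run()
    {
        AppDomain other = AppDomain.CreateDomain("Demo domain");
        try
        {
            other.DoCallBack(SayHello);

            var probe = (DomainProbe)other.CreateInstanceAndUnwrap(
                typeof(DomainProbe).Assembly.FullName,
                typeof(DomainProbe).FullName);

            // The call runs in "Demo domain"; the result comes back as a serialized copy.
            return probe.Describe();
        }
        finally
        {
            AppDomain.Unload(other);
        }
    }
}
```

Note that everything goes smoothly here only because the demo creates the second domain itself, so that domain can resolve the assembly containing DomainProbe.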

The problem with AppDomains and a possible solution.

What was the problem I faced when using several app domains? If you try to get regular objects from another AppDomain, you will likely encounter an exception saying that they are not marked as Serializable. So you are automatically bound to either using DoCallBack() or creating a class derived from MarshalByRefObject. I used the second, but for the case described below it makes little difference which approach you choose, because you will face one and the same problem: your other domains know nothing about the Snoop.exe assembly.

So, for the current AppDomain (I mean the one you are “inside of”) you can use a handy method:

System.Reflection.Assembly.LoadFile("C:\\Temp\\My.Coolest.Assembly.dll");

which is basically what the Windows hook procedure mentioned earlier in this post uses. But we have no such way to load an assembly by its location into another AppDomain, unless we force that domain to call Assembly.LoadFile() through some proxy class. Several approaches have been discussed e.g. here. But all of them rely on the following fact: we've just created our new AppDomain ourselves, and this AppDomain is aware of our current assembly, i.e. of all the proxy classes we declared or the AssemblyResolve callbacks we subscribed it to. This is not our case. We obtained our domain from “nowhere”: we did not control its creation, set probing paths for assemblies, etc. I really, really didn't want to put anything into the GAC. I tried putting Snoop.exe into the AppDomain's base directory, and it even had to be in the Excel directory as well (however, I had to drop LoadFile() in favor of Load() because of the different load contexts). It began showing some signs of success. However, after ending up with multiple instances of the assembly and trying to cast back to a type:

var someInstance = (SomeType) appDomain.CreateInstanceAndUnwrap(
typeof(SomeType).Assembly.FullName, typeof(SomeType).FullName);

I was getting an error saying the cast was impossible.

In the end, I decided not to put Snoop.exe into the GAC. Instead, I wrote a small piece of code, put it into a class in a separate assembly and installed that assembly into the GAC. This way every AppDomain has no trouble resolving it. If you know a good approach that doesn't use the GAC, you are welcome to comment.

How do I find out which AppDomains I have in my process?

Alright, it was nice to examine them with WinDbg, but now we actually need to enumerate the AppDomain objects in our application. Unfortunately, .NET does not allow this through its managed APIs, and we are obliged to use a solution that deals with COM. I used the one proposed here:

    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;
    // Add the following as a COM reference - C:\WINDOWS\Microsoft.NET\Framework\vXXXXXX\mscoree.tlb
    using mscoree;

    // Enumerates all AppDomains of the current process through the CLR hosting COM interface.
    public static IList<AppDomain> GetAppDomains()
    {
        IList<AppDomain> domains = new List<AppDomain>();
        IntPtr enumHandle = IntPtr.Zero;
        CorRuntimeHostClass host = new mscoree.CorRuntimeHostClass();
        try
        {
            host.EnumDomains(out enumHandle);
            object domain = null;
            while (true)
            {
                // a null domain means the enumeration is exhausted
                host.NextDomain(enumHandle, out domain);
                if (domain == null) break;
                AppDomain appDomain = (AppDomain) domain;
                domains.Add(appDomain);
            }
            return domains;
        }
        catch (Exception e)
        {
            Console.WriteLine(e.ToString());
            return null;
        }
        finally
        {
            // close the enumeration only if it was actually opened
            if (enumHandle != IntPtr.Zero)
                host.CloseEnum(enumHandle);
            Marshal.ReleaseComObject(host);
        }
    }

As you can see, it requires referencing C:\WINDOWS\Microsoft.NET\Framework\vXXXXXX\mscoree.tlb. I don't know anything about version compatibility, but with the version taken from .NET 4 (x86) I was able to see not only my target app domains, but also an AppDomain running .NET 3.5 (tried with a test VSTO add-in) and reach its PresentationSource.CurrentSources collection. When speaking about .NET 3.5, I mean, of course, adjusting the path in Injector.cs, because my other add-ins run under .NET 4. If you have any trouble enumerating app domains, try playing with a different tlb… I suppose they are pretty similar but not identical (the file sizes are different).

The bad thing about this solution was that Visual Studio generated a .NET assembly named Interop.mscoree.dll. Of course, it didn't have a strong name. But I needed to put my DLL into the GAC, so I couldn't reference any library without a strong name. Luckily, Interop.mscoree.dll didn't contain any logic, just COM class declarations. So I simply extracted those and put them into a namespace within my assembly (SnoopRunner.Mscoree in the listing further down).

With that, the reference to Interop.mscoree.dll was gone. The only issue I encountered with this code: the thread where the AppDomains are enumerated must be STA. Hence, I explicitly created such a thread.

All about making Snoop cross-AppDomain

Then I made some adjustments to Snoop's codebase. In particular, I decided not to show a message if the Root Visual wasn't found for a single AppDomain. Instead, I show it only if none of the AppDomains could provide a presentation source. I didn't want to make any big changes to Snoop's codebase, so the only changes I allowed myself were:

1) converting some SnoopUI void-methods to return bool (to indicate whether they succeeded or not),

2) adjusting Injector methods to call an assembly from GAC before calling Snoop.exe methods,

3) adding some stuff for Win32 <-> WPF interop (the need for this is described later).

That's it. All the AppDomain-handling logic went into a separate assembly in the GAC, in order not to pollute the Snoop assembly.

Here is the code of the CrossDomainSnoop class.

using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Threading;
using System.Windows;
using SnoopRunner.Mscoree;

namespace SnoopRunner
{
    public class CrossDomainSnoop : MarshalByRefObject
    {
        public void CrossDomainGoBabyGo(string snoopAssemblyLocation, string snoopUiTypeName, string methodName)
        {
            _snoopAssemblyLocation = snoopAssemblyLocation;
            _snoopUiTypeName = snoopUiTypeName;
            _methodName = methodName;

            var threadSTA = new Thread(EnumAppDomains);
            threadSTA.SetApartmentState(ApartmentState.STA); //STA is required when enumerating app domains
            _evt = new AutoResetEvent(false);
            threadSTA.Start();
            _evt.WaitOne();

            bool succeeded = false;
            if (_appDomains == null || _appDomains.Count == 0)
            {
                var result =
                    MessageBox.Show
                        (
                            "Snoop wasn't able to enumerate app domains. Do you want to run it in a single-app domain mode?",
                            "Enter Single App Domain Mode",
                            MessageBoxButton.YesNo,
                            MessageBoxImage.Question
                        );
                
                if (result == MessageBoxResult.Yes)
                {
                    succeeded = RunGoBabyGo(_snoopAssemblyLocation, _snoopUiTypeName, _methodName);
                }
            }
            else
            {
                foreach (var appDomain in _appDomains)
                {
                    try
                    {
                        var crossDomainSnoop =
                            (CrossDomainSnoop)
                            appDomain.CreateInstanceAndUnwrap(typeof(CrossDomainSnoop).Assembly.FullName,
                                                            typeof(CrossDomainSnoop).FullName);
                        //runs in a separate AppDomain
                        var appDomainSucceeded = crossDomainSnoop.RunGoBabyGo(_snoopAssemblyLocation, _snoopUiTypeName, _methodName);
                        succeeded = succeeded || appDomainSucceeded;
                    }
                    catch (FileNotFoundException)
                    {
                        //TODO: report it; for now a domain that can't resolve our assembly is simply skipped
                    }
                }
            }

            if (!succeeded)
            {
                MessageBox.Show
                    (
                        "Can't find a current application or a PresentationSource root visual!",
                        "Can't Snoop",
                        MessageBoxButton.OK,
                        MessageBoxImage.Exclamation
                    );
            }
        }

        private void EnumAppDomains()
        {
            _appDomains = GetAppDomains();
            _evt.Set();
        }

        //intended to run in a separate appdomain
        public bool RunGoBabyGo(string location, string typeName, string methodName)
        {
            var assembly = Assembly.LoadFrom(location);
            if (assembly != null)
            {
                Type type = assembly.GetType(typeName);
                if (type != null)
                {
                    MethodInfo methodInfo = type.GetMethod(methodName, BindingFlags.Static | BindingFlags.Public);
                    if (methodInfo != null)
                    {
                        return (bool) methodInfo.Invoke(null, null);
                    }
                }
            }
            return false;
        }

        private static IList<AppDomain> GetAppDomains()
        {
            IList<AppDomain> result = new List<AppDomain>();
            IntPtr enumHandle = IntPtr.Zero;
            CorRuntimeHostClass host = new CorRuntimeHostClass();
            try
            {
                host.EnumDomains(out enumHandle);
                object domain = null;
                while (true)
                {
                    host.NextDomain(enumHandle, out domain);
                    if (domain == null) break;
                    AppDomain appDomain = (AppDomain)domain;
                    result.Add(appDomain);
                }
                return result;
            }
            catch (Exception e)
            {
                Console.WriteLine(e.ToString());
                return null;
            }
            finally
            {
                host.CloseEnum(enumHandle);
                Marshal.ReleaseComObject(host);
            }
        }

        private IList<AppDomain> _appDomains;
        private AutoResetEvent _evt;
        private string _snoopAssemblyLocation;
        private string _snoopUiTypeName;
        private string _methodName;
    }
}

By the time I got it working, I thought that was all. But it wasn't :)

Why do my keystrokes bypass my WPF window and go directly to Excel?

Why do my WPF ContextMenus and ComboBoxes disappear immediately after showing in Excel?

I'm very glad we have all this WPF stuff nowadays, because dealing with native programming is a mess and slows progress down considerably. This part is all about imitating ElementHost.EnableModelessKeyboardInterop(), and more.

Here I had my Snoop window appearing all right, and noticed two basic problems:

1) All the keystrokes were going directly to the Excel window, bypassing our WPF window.

2) Context menus and combo boxes were disappearing immediately after showing.

Of course, I googled a lot on both topics. The first one was very similar to the scenario where WPF is hosted in a WinForms application. For that case Microsoft provides the method ElementHost.EnableModelessKeyboardInterop(). I understood I needed something similar for the scenario where WPF is hosted inside a Win32 application.

The second problem was vividly described here.

At the very least, I was frustrated by not being able to run my favorite “Delve” command from the right-click context menu.

Then circumstances changed in an interesting way: in our add-in (WPF hosted inside a Custom Task Pane) we faced some problems when clicking ContextMenu items that overlap the Excel window (I mean menu items located outside the Custom Task Pane bounds). The issue is described here, and a working :) workaround is described here. However, we found that the workaround works only if the WPF content has focus. So we decided to test whether we could replace the Custom Task Pane with a regular WPF Window in the scenarios where we do not need panel docking. We couldn't even check menu clicking, because we ran into the two issues with the WPF Window I'm speaking about. Thus, my work tasks and my research tasks overlapped a bit.

We noticed that we don't face problems with menus or with the keyboard in these cases:

1) Hosting WPF content inside a Windows Form (but its old-fashioned styling does not appeal to us).

2) Running the WPF Window on a separate UI thread (but multithreading in the UI is evil).

So I began investigating what was wrong and how to make it work on the same UI thread as Excel's.

Of course, we can read some nice guidelines on hosting WPF in Win32 applications and some more. But again, as we were not controlling the creation of the Excel window and (more importantly) of the WPF HwndSources, all we have is Windows hooks.

To solve the keyboard issue I looked into Microsoft's implementation of ElementHost.EnableModelessKeyboardInterop(). Unfortunately the implementation details appeared to be empty in Reflector :-O, but fortunately the sources were available on the internet after a bit of googling. What they basically do there is create an implementation of WinForms' IMessageFilter to filter keyboard messages and send them to WPF via ComponentDispatcher.RaiseThreadMessage(). They add this filter to the WinForms Application's collection of message filters. This is all right except for the fact that I do not have any WinForms Application :). Looking deeper into what they do with filters, it turned out that they install a Windows hook, and for handled messages they set the message identifier to zero, so the message becomes WM_NULL and is not a keyboard message any more. They are lucky that a WH_GETMESSAGE hook allows message modification, which is not true of some other hooks. So I just repeated this behavior:

private void ProcessMessage(ref Win32.Message message)
{
    if (!_window.IsActive || message.hWnd != _windowHandle)
        return;

    switch (message.msg)
    {
        case Win32.Messages.WM_KEYDOWN: //0x100  
        case Win32.Messages.WM_KEYUP: //0x101 
        case Win32.Messages.WM_CHAR: //0x102  
        case Win32.Messages.WM_DEADCHAR: //0x103 
        case Win32.Messages.WM_SYSKEYDOWN: //0x104 
        case Win32.Messages.WM_SYSKEYUP: //0x105 
        case Win32.Messages.WM_SYSCHAR: //0x106  
        case Win32.Messages.WM_SYSDEADCHAR: //0x107 
            var interopMsg = new MSG
            {
                hwnd = message.hWnd,
                message = message.msg,
                wParam = message.wparam,
                lParam = message.lparam,
                pt_x = 0,
                pt_y = 0,
                time = Win32.GetMessageTime()
            };
            var messageCopy = new Win32.Message { hWnd = message.hWnd, lparam = message.lparam, msg = message.msg, wparam = message.wparam };
            //prevent further propagation of the source message: we don't want the hosting environment
            //to receive it (or it will start typing characters into the Excel window)
            message.msg = 0;
            //since there is now no one else to translate key messages into character messages for us,
            //we do it ourselves (WM_KEYDOWN, WM_KEYUP, WM_SYSKEYDOWN, WM_SYSKEYUP to WM_CHAR)
            Win32.TranslateMessage(ref messageCopy);
            ComponentDispatcher.RaiseThreadMessage(ref interopMsg); //notify WPF environment about our message
            break;
    }
}
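The snippet above only shows the message handling. Installing the hook itself might look like the following sketch. KeyboardMessageHook, NativeMsg and MessageHandler are hypothetical names (the Win32 wrapper class used above is not shown in this post), while SetWindowsHookEx, CallNextHookEx and UnhookWindowsHookEx are the regular user32 functions.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public sealed class KeyboardMessageHook : IDisposable
{
    const int WH_GETMESSAGE = 3;
    const int PM_REMOVE = 0x0001;

    // Managed mirror of the native MSG structure.
    [StructLayout(LayoutKind.Sequential)]
    public struct NativeMsg
    {
        public IntPtr hwnd;
        public uint message;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public int pt_x;
        public int pt_y;
    }

    public delegate void MessageHandler(ref NativeMsg message);
    delegate IntPtr HookProc(int nCode, IntPtr wParam, ref NativeMsg lParam);

    [DllImport("user32.dll", SetLastError = true)]
    static extern IntPtr SetWindowsHookEx(int idHook, HookProc lpfn, IntPtr hMod, uint dwThreadId);
    [DllImport("user32.dll")]
    static extern bool UnhookWindowsHookEx(IntPtr hhk);
    [DllImport("user32.dll")]
    static extern IntPtr CallNextHookEx(IntPtr hhk, int nCode, IntPtr wParam, ref NativeMsg lParam);
    [DllImport("kernel32.dll")]
    static extern uint GetCurrentThreadId();

    readonly HookProc _proc;            // the field keeps the delegate alive for the native side
    readonly MessageHandler _handler;
    IntPtr _hook;

    // Must be constructed on the UI thread whose messages should be filtered.
    public KeyboardMessageHook(MessageHandler handler)
    {
        _handler = handler;
        _proc = HookCallback;
        // Thread-specific hook: no module handle, only the current thread is hooked.
        _hook = SetWindowsHookEx(WH_GETMESSAGE, _proc, IntPtr.Zero, GetCurrentThreadId());
        if (_hook == IntPtr.Zero) throw new Win32Exception();
    }

    // 0x100..0x107 covers the same eight messages as the switch above.
    public static bool IsKeyboardMessage(uint msg)
    {
        return msg >= 0x100 && msg <= 0x107;
    }

    IntPtr HookCallback(int nCode, IntPtr wParam, ref NativeMsg lParam)
    {
        // Act only when the message is really being removed from the queue;
        // the handler may rewrite it (e.g. set message to 0) because it is passed by ref.
        if (nCode >= 0 && wParam.ToInt64() == PM_REMOVE && IsKeyboardMessage(lParam.message))
            _handler(ref lParam);
        return CallNextHookEx(_hook, nCode, wParam, ref lParam);
    }

    public void Dispose()
    {
        if (_hook != IntPtr.Zero)
        {
            UnhookWindowsHookEx(_hook);
            _hook = IntPtr.Zero;
        }
    }
}
```

Checking wParam against PM_REMOVE is a precaution so that a message that is only peeked at is not handled twice.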

That's pretty much all for regular Win32 applications. You are unlikely to see the ContextMenu or ComboBox issue there. But for Excel some more work is required.

What about the menus and ComboBoxes? WPF ComboBox dropdowns and ContextMenus (as well as ToolTips, etc.) are displayed in a WPF Popup. The Popup captures the mouse to be able to detect a mouse click outside the captured element, which indicates that the popup should be closed (unless you explicitly set StaysOpen = true). Excel is known to be very “greedy” for the focus, so when the popup is opened, Excel continuously tries to grab the focus back (at least when the mouse is over the WPF window). Strangely enough, this doesn't happen on a separate UI thread or when hosting WPF stuff inside a WinForms ElementHost. When WPF receives a Windows message that requires giving the focus back, the popup is closed. Using Spy++ you can observe it happening in the following way:

1) For ContextMenu: the popup's newly created native window (HwndWrapper) gets “bombed” with WM_CAPTURECHANGED.

2) For ComboBox: the original window (HwndWrapper) gets “bombed” with WM_CAPTURECHANGED.

I wish I knew how to prevent Excel from doing this. But as a workaround (which might be very blunt and even dangerous) I just temporarily subclass the corresponding window's procedure and ignore this continuous WM_CAPTURECHANGED until the popup is closed. This is done via a WH_CALLWNDPROC-type hook.

        private int WndProcHook(int nCode, IntPtr wParam, ref Win32.CWPSTRUCT lParam)
        {
            if (nCode >= 0)
            {
                //ShowWindow(true)
                if (lParam.msg == Win32.Messages.WM_SHOWWINDOW && lParam.wparam.ToInt32() != 0)
                {
                    var hwndSource = HwndSource.FromHwnd(lParam.hWnd);
                    //check this is WPF popup
                    if (hwndSource != null && hwndSource.RootVisual != null && hwndSource.RootVisual.GetType().Name.Contains("PopupRoot"))
                    {
                        ControlType controlType;
                        var parentHwndSource = GetParentHwndSource(hwndSource, out controlType);
                        //we must handle only popups related to our Window, not any others
                        if (parentHwndSource != null && parentHwndSource.Handle == _windowHandle)
                        {
                            switch (controlType)
                            {
                                case ControlType.ContextMenu:
                                    _contextMenuPopupHandle = lParam.hWnd;
                                    SubclassWndProc(_contextMenuPopupHandle); //contextMenu is getting bombed with WM_CAPTURECHANGED
                                    break;
                                case ControlType.ComboBox:
                                    _comboBoxPopupHandle = lParam.hWnd;
                                    SubclassWndProc(_windowHandle); //window itself is getting bombed with WM_CAPTURECHANGED
                                    break;
                            }
                        }
                    }
                }
                //ShowWindow(false) - when ContextMenu closes we may come here
                else if (lParam.msg == Win32.Messages.WM_SHOWWINDOW && lParam.wparam.ToInt32() == 0 && _contextMenuPopupHandle == lParam.hWnd)
                {
                    RestoreWndProc(_contextMenuPopupHandle);
                }
                //DestroyWindow() - when ComboBox closes we'll come only here,
                //after ContextMenu is closed we'll also come here after ShowWindow(false), we are checking the hWnd
                else if (lParam.msg == Win32.Messages.WM_DESTROY && (_comboBoxPopupHandle == lParam.hWnd || _contextMenuPopupHandle == lParam.hWnd))
                {
                    if (_comboBoxPopupHandle == lParam.hWnd)
                        RestoreWndProc(_windowHandle);
                    Win32.SendMessage(_windowHandle, Win32.Messages.WM_CAPTURECHANGED, IntPtr.Zero, IntPtr.Zero);
                    _comboBoxPopupHandle = IntPtr.Zero;
                    _contextMenuPopupHandle = IntPtr.Zero;
                }
            }
            return Win32.CallNextHookEx(_hMessageProcHook, nCode, wParam, ref lParam);
        }

        private enum ControlType { Other, ContextMenu, ComboBox }

        private static HwndSource GetParentHwndSource(HwndSource popupBasedHwndSource, out ControlType controlType)
        {
            controlType = ControlType.Other;

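            //"as" casts: an exception thrown inside a hook procedure is the last thing we want
            var rootElement = popupBasedHwndSource.RootVisual as FrameworkElement;
            var parentPopup = rootElement != null ? rootElement.Parent as Popup : null;
            if (parentPopup == null)
                return null; //the root is not hosted by a Popup, nothing to handle

            var parentHwndSource = (HwndSource)PresentationSource.FromVisual(parentPopup);

            if (parentPopup.TemplatedParent is ComboBox)
                controlType = ControlType.ComboBox;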

            //NOTE: for ContextMenu & ToolTip - popup is not represented in the HwndSource
            if (parentHwndSource == null)
            {
                if (!(parentPopup.Child is ContextMenu))
                    return null; //we don't want to handle ToolTip etc.

                parentHwndSource = (HwndSource)PresentationSource.FromVisual(parentPopup.PlacementTarget);
                controlType = ControlType.ContextMenu;
            }
            return parentHwndSource;
        }

        private IntPtr CustomWndProc(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam)
        {
            if (msg == Win32.Messages.WM_CAPTURECHANGED && lParam == IntPtr.Zero) //IntPtr comparison, ToInt32() may overflow in a 64-bit process
            {
                Mouse.Capture(null, CaptureMode.None); //just to be a bit friendly with Excel's expectations
                return IntPtr.Zero;
            }
            if (_subclassedWindowData == null)
                return IntPtr.Zero;
            return Win32.CallWindowProc(_subclassedWindowData.PrevHwndProc, hWnd, msg, wParam, lParam);
        }

        private void SubclassWndProc(IntPtr hWnd)
        {
            Win32.WndProc hookProcDelegate = CustomWndProc;
            var prevWndProc =
                (IntPtr)Win32.SetWindowLong(hWnd, Win32.WindowAttributes.GWL_WNDPROC, (int)Marshal.GetFunctionPointerForDelegate(hookProcDelegate));
            _subclassedWindowData = new SubclassedWindowData { PrevHwndProc = prevWndProc, HookProcDelegate = hookProcDelegate };
        }

        private void RestoreWndProc(IntPtr hWnd)
        {
            if (_subclassedWindowData == null)
                return;

            Win32.SetWindowLong(hWnd, Win32.WindowAttributes.GWL_WNDPROC, _subclassedWindowData.PrevHwndProc.ToInt32());
            _subclassedWindowData = null;
        }
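
A note on bitness: the (int) casts around SetWindowLong above are fine in a 32-bit Excel process, but a window procedure address does not necessarily fit into an int in a 64-bit one. Below is a minimal sketch of a pointer-sized variant. The WindowSubclassing class and its SetWndProc method are hypothetical names of mine, not part of the Snoop sources or of the Win32 wrapper used above; only the user32.dll exports are real.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the Snoop sources.
internal static class WindowSubclassing
{
    private const int GWLP_WNDPROC = -4; // same value as GWL_WNDPROC

    // The W (Unicode) entry points are named explicitly: replacing the window
    // procedure through the A version would make the window be treated as ANSI.
    [DllImport("user32.dll", EntryPoint = "SetWindowLongW", ExactSpelling = true, SetLastError = true)]
    private static extern int SetWindowLong32(IntPtr hWnd, int nIndex, int dwNewLong);

    [DllImport("user32.dll", EntryPoint = "SetWindowLongPtrW", ExactSpelling = true, SetLastError = true)]
    private static extern IntPtr SetWindowLongPtr64(IntPtr hWnd, int nIndex, IntPtr dwNewLong);

    // Replaces the window procedure of hWnd and returns the previous one.
    public static IntPtr SetWndProc(IntPtr hWnd, IntPtr newWndProc)
    {
        // 32-bit user32.dll does not export SetWindowLongPtr (it is a macro
        // over SetWindowLong there), so the call is chosen by pointer size.
        return IntPtr.Size == 8
            ? SetWindowLongPtr64(hWnd, GWLP_WNDPROC, newWndProc)
            : new IntPtr(SetWindowLong32(hWnd, GWLP_WNDPROC, newWndProc.ToInt32()));
    }
}
```

With such a helper, SubclassWndProc would pass Marshal.GetFunctionPointerForDelegate(hookProcDelegate) as is, and RestoreWndProc would pass PrevHwndProc without calling ToInt32(). As in the code above, the delegate has to stay referenced while the window is subclassed, otherwise it may get garbage-collected.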

We also decided to enable typing into the Excel window after a single mouse click on a cell or some other Excel area when the focus was previously in the WPF window. In this case we mark the WPF window as inactive (window.IsActive becomes false), hence the keyboard hook behaves differently. This was done with a WH_MOUSE-type hook:

private int MouseProcHook(int nCode, IntPtr wParam, ref Win32.MouseHookStruct lParam)
{
    if (nCode >= 0) //negative codes must be passed on without any processing
    {
        switch (wParam.ToInt32())
        {
            case Win32.Messages.WM_LBUTTONDOWN:
            case Win32.Messages.WM_LBUTTONUP:
            case Win32.Messages.WM_LBUTTONDBLCLK:
            case Win32.Messages.WM_RBUTTONDOWN:
            case Win32.Messages.WM_RBUTTONUP:
            case Win32.Messages.WM_RBUTTONDBLCLK:
            case Win32.Messages.WM_MBUTTONDOWN:
            case Win32.Messages.WM_MBUTTONUP:
            case Win32.Messages.WM_MBUTTONDBLCLK:
                if (lParam.hwnd != _windowHandle) //clicked outside our window, we should deactivate it, so _window.IsActive will become false
                    Win32.SendMessage(_windowHandle, Win32.Messages.WM_ACTIVATE, IntPtr.Zero, IntPtr.Zero); //wParam 0 means WA_INACTIVE
                break;
        }
    }
    return Win32.CallNextHookEx(_hMessageProcHook, nCode, wParam, ref lParam);
}
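
The post does not show how such a hook gets installed, so here is a minimal sketch of a thread-local WH_MOUSE hook, assuming it is created on the UI thread that owns the WPF window. ThreadMouseHook and its callback parameter are hypothetical names, and the signatures are simplified compared to the Win32 wrapper used above (plain IntPtr parameters instead of ref structs); SetWindowsHookEx, UnhookWindowsHookEx, CallNextHookEx and GetCurrentThreadId are the real Win32 functions.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical wrapper, not part of the Snoop sources.
internal sealed class ThreadMouseHook : IDisposable
{
    private const int WH_MOUSE = 7;

    private delegate IntPtr HookProc(int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", SetLastError = true)]
    private static extern IntPtr SetWindowsHookEx(int idHook, HookProc lpfn, IntPtr hMod, uint dwThreadId);

    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool UnhookWindowsHookEx(IntPtr hhk);

    [DllImport("user32.dll")]
    private static extern IntPtr CallNextHookEx(IntPtr hhk, int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentThreadId();

    private readonly HookProc _callback;          // the field keeps the delegate alive
    private readonly Action<int> _onMouseMessage; // receives the mouse message id
    private IntPtr _hookHandle;

    public ThreadMouseHook(Action<int> onMouseMessage)
    {
        if (onMouseMessage == null) throw new ArgumentNullException("onMouseMessage");
        _onMouseMessage = onMouseMessage;
        _callback = HookCallback;

        // hMod is null and the thread id is our own: the hook sees only
        // the mouse messages of the calling (UI) thread.
        _hookHandle = SetWindowsHookEx(WH_MOUSE, _callback, IntPtr.Zero, GetCurrentThreadId());
        if (_hookHandle == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }

    private IntPtr HookCallback(int nCode, IntPtr wParam, IntPtr lParam)
    {
        if (nCode >= 0)
            _onMouseMessage(wParam.ToInt32()); // for WH_MOUSE wParam is the message id
        return CallNextHookEx(_hookHandle, nCode, wParam, lParam);
    }

    public void Dispose()
    {
        if (_hookHandle != IntPtr.Zero)
        {
            UnhookWindowsHookEx(_hookHandle);
            _hookHandle = IntPtr.Zero;
        }
    }
}
```

The hook is meant to live as long as the hosted window does and to be disposed when the window closes.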

I actually put all of this “native” stuff into one class named NativeMessagesInterceptor. It is split into regions, though, so you can easily separate the logic according to your needs, at least into several classes by hook type.

The source code

The code is provided as is. It hasn’t been tested thoroughly; it’s just fresh and raw :). You may need to adjust the post-build event of the SnoopRunner project, because it now installs the assembly into the GAC with the following command:

"$(FrameworkSDKDir)bin\gacutil.exe" /i $(TargetPath) /f

One of the VC++ .NET projects was not building properly on my machine, so I unloaded it from the solution; you may need to load it back.

I also included all the TortoiseSVN (ver. 1.6.x.x) payload, so you can easily see what was changed by using Diff.

DOWNLOAD (~27.1 MB)

Best regards,
Sid.

Tuesday, October 11, 2011

Finding out element style sources in WPF

Hi to everyone passing by to look through this blog. I’m currently working on a project where the main technology is WPF. We are developing add-ins for MS Excel. Recently I noticed some style weirdness with ContextMenu controls.
We are hosting WPF content in a Custom Task Pane through WinForms’ ElementHost. It currently has two tabs. In one of the tabs we’re hosting a legacy control:

Image 1

You can see a styled context menu in the picture above (the white menu in the lower right corner). However, if I switch to Tab2 and then return to Tab1, something changes in the loading order of our legacy libraries (I don’t know what exactly), and some other styles get higher priority. Oddly enough, all the styling is preserved… except for the ContextMenu, which looks unstyled:

Image 2

In my previous jobs I used Snoop WPF to investigate styling and the like. However, in my current environment this great tool doesn’t work. We used to have a TestHarness for running our content outside of the MS Excel environment (in a simple WPF application), but as time passed we unintentionally stopped supporting it because it was rarely used.

I found myself in a very unpleasant situation. At first I tried to look through the style usages in our huge application. That didn’t lead me to the canonical style for ContextMenu. Then a full-text search across all solution files followed… Too many places to examine, too much time spent, and again no luck. What would one do next?

I must say that this legacy code didn’t apply any specific style to this context menu inline (explicitly); hence it relied on some common style for this control defined in the resources somewhere up the tree.

I noticed that there were lots of TextBoxes (actually they are custom controls) next to the elements that open this context menu. So I decided to handle the GotFocus event of these custom TextBoxes before and after tab switching. Inside the handler I was able to see what the default ContextMenu style gets resolved to. To obtain resources in WPF we have a wonderful method, FindResource(). The documentation on MSDN is too minimalistic (as always); it relies on developers reading books and using MSDN just as a reference source. The example in that article tells nothing either: it doesn’t really touch the topic of what can be passed in as resourceKey. The key is the same as what we use in XAML, i.e. it can be a string (resource name), a ComponentResourceKey or a System.Type (the default resource for the type), perhaps even more.

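var defaultStyle = (Style)this.FindResource(typeof(TextBlock)); // the default (implicit) style for the type
var someStyle = (Style)this.FindResource("SomeStyle"); // a resource with a string key
// SomeControl and "SomeResourceId" are placeholder names for illustration only
var componentResourceKey = new ComponentResourceKey(typeof(SomeControl), "SomeResourceId");
var someComponentStyle = (Style)this.FindResource(componentResourceKey);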

Alright, here is what I had in my case:

this.GotFocus += OnControlGotFocus;

private void OnControlGotFocus(object sender, RoutedEventArgs e)
{
    var defaultStyle = (Style)this.FindResource(typeof (ContextMenu));
}

After the first run I discovered that before switching tabs defaultStyle was resolved to the proper style (as expected), but after switching tabs and returning to the first tab it was resolved to the default style from the PresentationFramework.Aero assembly (!!!). This confused me even more, because it didn’t look like some new style was being applied, but rather like the applied one was being reset.

Now I had to discover where (the hell) it takes this default style from. Obviously, I had to debug inside .NET’s FindResource() method. I don’t know much about the current state of compatibility between the debug symbols and sources on Microsoft’s public servers and the locally installed .NET Framework. But I remember that in the days of Visual Studio 2008 and .NET Framework 3.5 SP1 it downloaded some debug symbols and sources which (in my case) turned out to be incompatible with the installed libraries. Hence I decided to use Reflector VSPro (I have a trial version) to generate the pdb and sources directly from PresentationFramework.dll. Debugging turned out to be rather convenient, except that the library was compiled as “optimized”, so you can see almost nothing :) in the debugger. Fortunately, some important information is still available.




In the picture above: I stepped in here by pressing F11, hence I know what this optimized expression actually is.

So it goes to the FindResourceInternal() method where all the magic happens:



From the FindResourceInTree() method we navigate to the FindResourceOnSelf() method.



Here my breakpoint didn’t get hit (perhaps because of optimizations):



So I found myself directly inside of dictionary.Contains():



As one might notice, this is the main method I’m going to debug. It is here where I’ll find out which dictionary returned the weird resource.

When hovering over the dictionary instance, you’ll see that almost none of its fields can be evaluated because of the optimizations. But the field named _source is luckily accessible:



You can see that m_String contains all we need. So I added it to my watch list to always keep an eye on it:



Okay, finally the flag got the value of true in the dictionary.Contains() method:



Please pay attention to the watch list. I couldn’t believe my eyes!! Someone had managed to add a reference to PresentationFramework.Aero’s dictionary to the collection of merged dictionaries… But it turned out to be true when I opened the corresponding XAML file:

<ResourceDictionary.MergedDictionaries>
    <ResourceDictionary Source="pack://application:,,,/PresentationFramework.Aero;V3.0.0.0;31bf3856ad364e35;component/themes/aero.normalcolor.xaml" />  
    <ResourceDictionary Source="pack://application:,,,/WPFToolKit;V3.5.40128.1;;component/themes/aero.normalcolor.xaml" />
</ResourceDictionary.MergedDictionaries>


How did I get to this dictionary?
This resource dictionary (by the way) was reached via the following path:
1) our control inherits from a control defined in another library;
2) that other library has some dictionary references in its generic.xaml;
3) climbing through those references two libraries up, I found the target resource dictionary.
A bit convoluted, isn’t it? This is where debugging helped me greatly.

Fortunately, the styles from these dictionaries were not required in my code base, so I removed these merged dictionary references and the problem was gone. I still don’t know why the styles were resolved the correct/desired way on the first load, but at least I managed to find the source of the weird styles.
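
For the record, the same question (“which dictionary does this key come from?”) can probably be answered without stepping through framework code. Below is a rough sketch of such a helper. It is not what I actually used: ResourceOriginFinder and its methods are hypothetical names, and it only approximates the lookup, walking element resources up the tree and then Application resources, without theme or system resources.

```csharp
using System;
using System.Windows;
using System.Windows.Media;

// Hypothetical diagnostic helper, not part of WPF or of the original code base.
public static class ResourceOriginFinder
{
    // Returns the chain of dictionaries that supplies the key, or null when
    // neither the elements up the tree nor the Application define it.
    public static string Describe(FrameworkElement start, object key)
    {
        for (DependencyObject node = start; node != null; node = GetParent(node))
        {
            ResourceDictionary resources = null;
            var element = node as FrameworkElement;
            var contentElement = node as FrameworkContentElement;
            if (element != null) resources = element.Resources;
            else if (contentElement != null) resources = contentElement.Resources;

            string path = resources == null ? null : FindIn(resources, key);
            if (path != null) return node.GetType().Name + " -> " + path;
        }

        // Application.Current is often null inside an Office add-in.
        if (Application.Current != null)
        {
            string path = FindIn(Application.Current.Resources, key);
            if (path != null) return "Application -> " + path;
        }
        return null;
    }

    private static DependencyObject GetParent(DependencyObject node)
    {
        DependencyObject parent = LogicalTreeHelper.GetParent(node);
        if (parent == null && node is Visual)
            parent = VisualTreeHelper.GetParent(node);
        return parent;
    }

    private static string FindIn(ResourceDictionary dictionary, object key)
    {
        // Contains() also searches the merged dictionaries.
        if (!dictionary.Contains(key)) return null;

        string self = dictionary.Source != null ? dictionary.Source.ToString() : "(inline dictionary)";

        // Keys enumerates only the dictionary's own entries, and those win over merged ones.
        foreach (object ownKey in dictionary.Keys)
            if (object.Equals(ownKey, key)) return self;

        // Merged dictionaries are searched from the last one to the first.
        for (int i = dictionary.MergedDictionaries.Count - 1; i >= 0; i--)
        {
            string nested = FindIn(dictionary.MergedDictionaries[i], key);
            if (nested != null) return self + " -> " + nested;
        }
        return self;
    }
}
```

Inside the GotFocus handler above, something like Debug.WriteLine(ResourceOriginFinder.Describe(this, typeof(ContextMenu))) would then print the chain of Source URIs, which in my case should end with the PresentationFramework.Aero pack URI.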

Regards,
Sid.

UPDATE 1. I think I have a clue why this restyling happens. Each time the control is activated (I mean after switching tabs), it might be reconstructing all of its child UI-objects, and by that time the assembly with undesired styles may have already been loaded.
UPDATE 2. There is a way to disable code optimizations. Unfortunately, I hadn’t found it by the time I was writing this post. Please refer to Cory Plotts’ post and of course read his details on debugging .NET source code with Reflector VS Pro.