Finding memory leaks using windbg

Hello everybody, I'm trying to find a memory leak in a dotnet-app using windbg and the sos-extension.

Under some rare circumstances, the app allocates ~1.4 GB and then dies with an OutOfMemory-Exception.

However, this happens rarely and so the bug is hard to track down. I can see that the heap is littered with Byte-Arrays when the OOM-Exception is thrown.

Using gcroot I can see that those byte-arrays are used by (and only by) OracleDataReader-Objects. However, when I search for the roots of those OracleDataReaders I get the output:

DOMAIN(00166420):HANDLE(WeakSh):aa3400:Root:3a0b0e60(Devart.Data.Oracle.OracleDataReader)->
5a6df040(System.Byte[])

which seems to only tell me what I already know: that the DataReaders are using the bytearrays.

But I can't see who's referencing the datareaders.

Why aren't those objects gc'ed?

Is there any way to find out? And yes, I *am* calling Dispose on all the datareaders ;) Any help with this would be highly appreciated! Cheers Stefan

I think you are working at too low a level for memory leak detection, but in any case you should ask this person, maybe he knows: http://blogs.msdn.com/alejacma/default.aspx

Is this an app written in VB.NET?

Did you ship the Debug build?

Hello again, thanks for your replies.

As for your question nobugz: The dump is from a debug build, the app is written in C#.

When I ran into the OOM-Exception I attached WinDbg and saved the dump. I looked a little bit further and found the following: for some of the OracleDataReader-objects, !gcroot results in the output

Finalizer queue:Root:39bcd118(Devart.Data.Oracle.OracleDataReader)

Looking at the output of !finalizequeue I discovered the following:

SyncBlocks to be cleaned up: 6
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
generation 0 has 2069 finalizable objects (0898b350->0898d3a4)
generation 1 has 42 finalizable objects (0898b2a8->0898b350)
generation 2 has 17118 finalizable objects (0897a730->0898b2a8)
Ready for finalization 10115 objects (0898d3a4->089971b0)

Although I'm not sure how to interpret this, I'm quite sure it's not a healthy condition.

I already found this article on the web, which seems to be related to my problem: http://blogs.msdn.com/tess/archive/2007/10/19/net-finalizer-memory-leak-debugging-with-sos-dll-in-visual-studio.aspx

Most of the symptoms in this article do not apply in my case, though :( If anybody has an idea where to look next I'd be happy to hear it since I'm (obviously) pretty new to this. Cheers Stefan

I'm still looking into this.

The execution of the finalizerthread seems to be stuck in a WaitForSingleObject()-call, this seems to be the root of the problem.

After googling some more I found this: http://mcfunley.com/355/some-twists-on-blocked-finalizers It looks a lot like the situation I've got here. The solution suggested on the blog (decorating Main() with [MTAThread]) doesn't work for me, since it causes an exception when calling Show() on a form that's sitting on another form ("Drag and Drop registration failed; Current thread must be STA", something like that) :( Any ideas?

Sounds to me like a flaw in the Oracle provider.

Blocking in the finalizer is about as evil as it gets.

There's a 2 second timeout on the finalizer thread, that could explain the mass of unfinalized objects you've got.

Religiously using the Dispose() method should help to relieve the pressure on the finalizers.
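To illustrate the point about Dispose(), here is a minimal sketch of the usual pattern. The class ReaderLike is hypothetical and not part of any data provider. It assumes the type calls GC.SuppressFinalize(this) from Dispose(), which tells the runtime that the finalizer no longer needs to run for that instance, so a disposed object adds no work for the finalizer thread. Whether a given third-party type does this is something only its vendor or a decompiler can confirm.

```csharp
using System;

// Hypothetical resource holder, for illustration only.
sealed class ReaderLike : IDisposable
{
    private byte[] _buffer = new byte[64 * 1024];
    private bool _disposed;

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _buffer = new byte[0];        // drop the large array early
        GC.SuppressFinalize(this);    // the finalizer no longer needs to run
    }

    ~ReaderLike()
    {
        // Fallback path, reached only when Dispose() was never called.
        _buffer = new byte[0];
    }
}

static class Program
{
    static void Main()
    {
        // 'using' guarantees Dispose() even if the body throws.
        using (var reader = new ReaderLike())
        {
            Console.WriteLine("working with " + reader.GetType().Name);
        }
    }
}
```

Note that this only reduces the number of objects waiting for finalization. It does not help if the finalizer thread is already stuck on some other object.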

Making the UI thread MTA is illegal in programs that create windows.
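A sketch of one alternative to marking Main() as [MTAThread]: keep the entry point STA and move only the work that needs an MTA onto a separate thread. This assumes that work never touches windows or other UI objects. The thread body is a placeholder, and the code is Windows-specific because apartment states are a COM concept.

```csharp
using System;
using System.Threading;

static class ApartmentDemo
{
    [STAThread] // the UI entry point stays single-threaded apartment
    static void Main()
    {
        Console.WriteLine("Main thread: " + Thread.CurrentThread.GetApartmentState());

        // Placeholder for non-UI work that should run in the MTA.
        var worker = new Thread(() =>
            Console.WriteLine("Worker thread: " + Thread.CurrentThread.GetApartmentState()));

        worker.SetApartmentState(ApartmentState.MTA); // must be set before Start()
        worker.IsBackground = true;
        worker.Start();
        worker.Join();
    }
}
```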

To get support for Oracle provider problems you probably need an Oracle support forum.

What makes you think it's the OracleDataReader that's blocking the finalizer thread?

The ODRs (and the bytearrays) use most of the memory, but other objects aren't gc'ed either. I've tried to isolate the problem by creating a little test app that does the same database-stuff as the "real" app.

The problem does not happen here, so it seems that something else is blocking the finalizer. What's the best way to find out which object could be responsible for the blocking?

As far as I understand, the finalizerthread can only be blocked if some finalizer method is blocking for some reason, like

~MyClass()
{
    // some blocking stuff happening here...
}

right?

I can't find something like that in the code.
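For anyone who wants to see the effect in isolation, here is a small sketch of a blocking finalizer of the kind described above. The types Blocker and Victim are hypothetical and exist only for this demonstration. It relies on the CLR running finalizers on a single dedicated thread, so one finalizer that does not return holds up every object queued behind it. The sleeps are a crude way to give the finalizer thread time to run.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type whose finalizer blocks until Gate is signaled.
class Blocker
{
    public static readonly ManualResetEvent Gate = new ManualResetEvent(false);

    ~Blocker()
    {
        Gate.WaitOne(); // stand-in for any blocking call inside a finalizer
    }
}

// Hypothetical well-behaved type that only counts its finalizations.
class Victim
{
    public static int Finalized;

    ~Victim()
    {
        Interlocked.Increment(ref Finalized);
    }
}

static class Program
{
    // Separate, non-inlined methods so no reference survives in Main's frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void MakeBlocker()
    {
        new Blocker();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void MakeVictims(int count)
    {
        for (int i = 0; i < count; i++)
        {
            new Victim();
        }
    }

    static void Main()
    {
        MakeBlocker();
        GC.Collect();        // the Blocker becomes eligible for finalization
        Thread.Sleep(1000);  // let the finalizer thread enter ~Blocker()

        MakeVictims(1000);
        GC.Collect();        // the victims are queued behind the stuck finalizer
        Thread.Sleep(1000);
        Console.WriteLine("Finalized while blocked: " + Volatile.Read(ref Victim.Finalized));

        Blocker.Gate.Set();  // unblock ~Blocker()
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Finalized after release: " + Volatile.Read(ref Victim.Finalized));
    }
}
```

While the gate is closed, the first count should stay well below the second one, which mirrors the growing "Ready for finalization" number in !finalizequeue. A block does not have to come from a finalizer in your own code, though. Runtime cleanup work done on the same thread can stall it as well.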

What makes you think it is not the OracleDataReader that causes this problem?

Once the time-out is up, nothing else gets finalized.

You can't find it in the code because you don't have the source code for the provider.

Presumably.

As I stated in my previous post, I isolated the database related stuff and the blocking is *not* occurring in that isolated environment.

So I'm pretty positive it's not the OracleDataReader that causes the problem.

That's why I believe some other object is blocking the finalizerthread.

I blamed the ODR in my initial post since it's taking up most of the space on the heap, but the blocking seems to happen somewhere else.

But how do I find out where?!

Stefan, You can try to find if there's a deadlock using the SOSEX extension for WinDbg - it has a very useful !dlk command that might help.

Google (or Bing!) on "Sosex" to download the extension

Hi Amit, thanks for the hint.

!dlk yields the output "No deadlocks detected" :(

!Waitlist from SIEExtPub does not produce any output.

I tried removing all the finalizers from the code - though this wouldn't be a permanent solution, I just wanted to see if the problem would persist.

It does. Can I conclude from this that there are blocking finalizers in the third party components we use?

Is there any way to determine which object's finalize is being called from the finalizerthread?

Is there any way to see what's in the freachable queue?

I keep posting my "findings", maybe somebody can make a rhyme out of this: The callstack of the finalizer thread looks like this when it's dead:

> ntdll.dll!KiFastSystemCallRet()
  [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]
  ntdll.dll!NtWaitForSingleObject() + 0xc bytes
  kernel32.dll!WaitForSingleObject() + 0x12 bytes
  ole32.dll!CoQueryProxyBlanket() + 0x5c5 bytes
  ole32.dll!775d1e50()
  ole32.dll!ReleaseStgMedium() + 0x12e7 bytes
  ole32.dll!CoGetObject() + 0xd26 bytes
  rpcrt4.dll!NdrProxySendReceive() + 0x40 bytes
  rpcrt4.dll!NdrProxySendReceive() + 0x138 bytes
  rpcrt4.dll!NdrProxySendReceive() + 0xcd bytes
  rpcrt4.dll!RpcBindingSetObject() + 0x4d bytes
  ole32.dll!CoGetObject() + 0x901 bytes
  ole32.dll!CoCreateObjectInContext() + 0xd1c bytes
  mscorwks.dll!CtxEntry::EnterContextOle32BugAware() + 0x2b bytes
  mscorwks.dll!CtxEntry::EnterContext() + 0x168 bytes
  mscorwks.dll!RCWCleanupList::ReleaseRCWListInCorrectCtx() + 0xf7 bytes
  mscorwks.dll!RCWCleanupList::CleanupAllWrappers() + 0xdbf50 bytes
  mscorwks.dll!SyncBlockCache::CleanupSyncBlocks() + 0xdb bytes
  mscorwks.dll!Thread::DoExtraWorkForFinalizer() + 0x4c4d7 bytes
  mscorwks.dll!WKS::GCHeap::FinalizerThreadWorker() + 0x89 bytes
  mscorwks.dll!Thread::DoADCallBack() - 0x1411f3 bytes
  mscorwks.dll!Thread::ShouldChangeAbortToUnload() - 0x14036b bytes
  mscorwks.dll!Thread::ShouldChangeAbortToUnload() - 0x140445 bytes
  mscorwks.dll!ManagedThreadBase_NoADTransition() + 0x32 bytes
  mscorwks.dll!ManagedThreadBase::FinalizerBase() + 0xd bytes
  mscorwks.dll!WKS::GCHeap::FinalizerThreadStart() + 0xa9 bytes
  mscorwks.dll!Thread::intermediateThreadProc() + 0x46 bytes
  kernel32.dll!GetModuleFileNameA() + 0x1ba bytes

I took several snapshots of the finalizer's stack when everything is working.

The only things missing from the stack when everything is working are the ole32/rpcrt4 frames, but this could be a coincidence.

Or could this be some kind of COM issue I'm having here?

It is cleaning up the RCW for an out-of-process COM component.
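If the stuck frames really are an RCW release that has to be marshaled into another apartment or process, one commonly suggested mitigation is to release the COM object explicitly on the thread that created it, so the finalizer thread never has to do it. The sketch below is Windows-only and rests on assumptions: "Hypothetical.Component" is a placeholder ProgID, ComCleanup is a made-up helper, and nothing in the dump shows which COM component is actually involved.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
static class ComCleanup
{
    // Releases one reference on the RCW from the calling thread,
    // instead of leaving the release to the finalizer thread.
    public static void Release(object comObject)
    {
        if (comObject != null && Marshal.IsComObject(comObject))
        {
            Marshal.ReleaseComObject(comObject);
        }
    }
}

static class Program
{
    [STAThread]
    static void Main()
    {
        // Placeholder ProgID; GetTypeFromProgID returns null when it is not registered.
        Type type = Type.GetTypeFromProgID("Hypothetical.Component");
        if (type == null)
        {
            Console.WriteLine("COM class not registered, nothing to demonstrate.");
            return;
        }

        object component = Activator.CreateInstance(type);
        try
        {
            // ... use the component here ...
        }
        finally
        {
            ComCleanup.Release(component); // deterministic release on the creating thread
        }
    }
}
```

Releasing early is only safe when no other code still uses the same wrapper, since a released RCW throws on the next call.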

Excellent candidate for timeouts of course.

More evidence for your data provider being the problem.

We're using devart which is 100%-managed code (at least that's what the vendor claims here: http://www.devart.com/dotconnect/oracle/).

Besides I couldn't reproduce the issue when I isolated the db stuff.

Why all that hate for oracle?

;)

Stefan, when looking at the unmanaged stack trace can you try the kb command which also gives you the arguments being passed to each method.

The first argument passed to ntdll!NtWaitForSingleObject is the handle id of the wait handle which can be of one of the following types:
