Blue screen intermittently due to ntkrnlmp.exe


hi,

i having 8 node cluster windows 2008 r2 datacenter edition, running script create csv disks. had encountered the blus scrren dump on host.

i have analyze memory dump , shows below details

microsoft (r) windows debugger version 6.11.0001.404 amd64  copyright (c) microsoft corporation. rights reserved.      loading dump file [c:\users\administrator.whqldcrtp\desktop\memory.dmp]  kernel summary dump file: kernel address space available    symbol search path is: srv*c:\websymbols*http://msdl.microsoft.com/download/symbols  executable search path is:   windows 7 kernel version 7600 mp (16 procs) free x64  product: server, suite: terminalserver datacenter singleuserts  built by: 7600.16385.amd64fre.win7_rtm.090713-1255  machine name:  kernel base = 0xfffff800`01e19000 psloadedmodulelist = 0xfffff800`02056e50  debug session time: wed oct 12 10:28:22.889 2011 (gmt-7)  system uptime: 0 days 6:22:40.045  loading kernel symbols  ...............................................................  ................................................................  ..........................  loading user symbols    loading unloaded module list  ......  *******************************************************************************  *                                                                             *  *                        bugcheck analysis                                    *  *                                                                             *  *******************************************************************************    use !analyze -v detailed debugging information.    bugcheck 9e, {fffffa8055a04190, 4b0, 0, 0}    caused : netft.sys ( netft!netftwatchdogtimerdpc+b9 )    followup: machineowner  ---------    8: kd> !analyze -v  *******************************************************************************  *                                                                             *  *                        bugcheck analysis                                    *  *                                                                             *  *******************************************************************************    user_mode_health_monitor (9e)  1 or more critical user mode components failed satisfy health check.  hardware mechanisms such watchdog timers can detect basic kernel  services not executing. however, resource starvation issues, including  memory leaks, lock contention, , scheduling priority misconfiguration,  may block critical user mode components without blocking dpcs or  draining nonpaged pool.  kernel components can extend watchdog timer functionality user mode  periodically monitoring critical applications. bugcheck indicates  user mode health check failed in manner such graceful  shutdown unlikely succeed. restores critical services  rebooting and/or allowing application failover other servers.  arguments:  arg1: fffffa8055a04190, process failed satisfy health check within  	configured timeout  arg2: 00000000000004b0, health monitoring timeout (seconds)  arg3: 0000000000000000  arg4: 0000000000000000    debugging details:  ------------------      process_object: fffffa8055a04190    default_bucket_id:  vista_driver_fault    bugcheck_str:  0x9e    process_name:  system    current_irql:  2    last_control_transfer:  fffff880042026a5 fffff80001e8af00    stack_text:    fffff880`0213d518 fffff880`042026a5 : 00000000`0000009e fffffa80`55a04190 00000000`000004b0 00000000`00000000 : nt!kebugcheckex  fffff880`0213d520 fffff800`01e96fa6 : fffff880`0213d618 00000000`00000000 00000000`406c0088 00000000`00000000 : netft!netftwatchdogtimerdpc+0xb9  fffff880`0213d570 fffff800`01e96326 : fffff880`0420f100 00000000`0016752c 00000000`00000000 00000000`00000000 : nt!kiprocesstimerdpctable+0x66  fffff880`0213d5e0 fffff800`01e96e7e : 00000035`7540ea44 fffff880`0213dc58 00000000`0016752c fffff880`02117b08 : nt!kiprocessexpiredtimerlist+0xc6  fffff880`0213dc30 fffff800`01e96697 : 00000035`7b0fdfc2 fffffa80`0016752c fffffa80`53674d38 00000000`0000002c : nt!kitimerexpiration+0x1be  fffff880`0213dcd0 fffff800`01e936fa : fffff880`02115180 fffff880`021202c0 00000000`00000002 fffff880`00000000 : nt!kiretiredpclist+0x277  fffff880`0213dd80 00000000`00000000 : fffff880`0213e000 fffff880`02138000 fffff880`0213dd40 00000000`00000000 : nt!kiidleloop+0x5a      stack_command:  kb    followup_ip:   netft!netftwatchdogtimerdpc+b9  fffff880`042026a5 cc              int     3    symbol_stack_index:  1    symbol_name:  netft!netftwatchdogtimerdpc+b9    followup_name:  machineowner    module_name: netft    image_name:  netft.sys    debug_flr_image_timestamp:  4a5bc48a    failure_bucket_id:  x64_0x9e_netft!netftwatchdogtimerdpc+b9    bucket_id:  x64_0x9e_netft!netftwatchdogtimerdpc+b9    followup: machineowner  ---------  
it crashing @ "netft.sys" "microsoft failover cluster virtual driver"

any body has idea same. 

thank in advance.

with regards,
varun

i'd recommend review following blog:

http://blogs.technet.com/b/askcore/archive/2009/06/12/why-is-my-2008-failover-clustering-node-blue-screening-with-a-stop-0x0000009e.aspx

the blue screen due windows 2008 cluster's default hang recovery action. can change default behavior, won't solve hanging issue...it prevents box bsod.

hope helps.


visit blog multi-site clustering - http://msmvps.com/blogs/jtoner


Windows Server  >  High Availability (Clustering)



Comments

Popular posts from this blog

server manager error: ADAM.events.xml could not be enumerated.

Cannot access Anywhere Access using domain name?

WMI Failure: Unable to update Local Resource Group