
BSOD/Bug Check 0x101: CLOCK_WATCHDOG_TIMEOUT

by 스쳐가는인연 2013. 3. 30.

How to approach BSOD 0x101 when it occurs on Windows Server 2008

 

Looking around the web, people approach BSOD 0x101 analysis with WinDbg in several different ways, but the method below looked like the quickest and easiest to me. (Just my opinion ~_~;;)

 

That said, it does not let you state the cause with certainty.

 

Bug Check 0x101: CLOCK_WATCHDOG_TIMEOUT
http://msdn.microsoft.com/en-us/library/ff557211(v=vs.85).aspx

 

Cause
The specified processor is not processing interrupts. Typically, this occurs when the processor is nonresponsive or is deadlocked.

 

These actions might prevent an error like this from happening again:
 1. Download and install updates and device drivers for your computer from Windows Update.
 2. Scan your computer for computer viruses.
 3. Check your hard disk for errors.

 

Check that the hardware firmware and drivers, as well as the OS patch level, are up to date.

 

Analyzing the dump with WinDbg

Unless the BSOD is a simple one, a small dump (minidump) often does not contain reliable information.

For a proper analysis, a kernel dump or larger appears to be the recommendation.

 

0: kd> !analyze -v

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

CLOCK_WATCHDOG_TIMEOUT (101)

An expected clock interrupt was not received on a secondary processor in an

MP system within the allocated interval. This indicates that the specified

processor is hung and not processing interrupts.

Arguments:

Arg1: 0000000000000004, Clock interrupt time out interval in nominal clock ticks.

Arg2: 0000000000000000, 0.

Arg3: fffff880030d6180, The PRCB address of the hung processor.

Arg4: 000000000000002b, 0.

 

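In my experience, Arg3 is the PRCB address of the hung processor and Arg4 is usually the number of that logical processor (0x2b = 43 here), even though the debugger output labels Arg4 only as "0".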

 

Debugging Details:

------------------

 

BUGCHECK_STR:  CLOCK_WATCHDOG_TIMEOUT_40_PROC

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

PROCESS_NAME:  System

CURRENT_IRQL:  d

 

STACK_TEXT: 

fffff800`02be5a28 fffff800`016dfa89 : 00000000`00000101 00000000`00000004 00000000`00000000 fffff880`030d6180 : nt!KeBugCheckEx

fffff800`02be5a30 fffff800`01692eb7 : 00000000`00000000 fffff800`0000002b 00000000`00026161 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x4e2e

fffff800`02be5ac0 fffff800`01bfc895 : fffff800`01c22460 fffff800`02be5c70 fffff800`01c22460 fffff880`00000000 : nt!KeUpdateSystemTime+0x377

fffff800`02be5bc0 fffff800`01684b73 : fffff800`00000000 00000000`ffffffff 00000000`00000000 00000000`00000000 : hal!HalpHpetClockInterrupt+0x8d

fffff800`02be5bf0 fffff800`01680342 : fffff800`017fae80 fffff800`00000001 00000000`00000001 fffff880`00000000 : nt!KiInterruptDispatchNoLock+0x163

fffff800`02be5d80 00000000`00000000 : fffff800`02be6000 fffff800`02be0000 fffff800`02be5d40 00000000`00000000 : nt!KiIdleLoop+0x32

 

STACK_COMMAND:  kb

SYMBOL_NAME:  ANALYSIS_INCONCLUSIVE

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: Unknown_Module

IMAGE_NAME:  Unknown_Image

DEBUG_FLR_IMAGE_TIMESTAMP:  0

FAILURE_BUCKET_ID:  X64_CLOCK_WATCHDOG_TIMEOUT_40_PROC_ANALYSIS_INCONCLUSIVE

BUCKET_ID:  X64_CLOCK_WATCHDOG_TIMEOUT_40_PROC_ANALYSIS_INCONCLUSIVE

 

Followup: MachineOwner

---------

 

0: kd> .bugcheck

Bugcheck code 00000101

Arguments 00000000`00000004 00000000`00000000 fffff880`030d6180 00000000`0000002b
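
As a small illustration of reading these arguments, the sketch below parses the `Arguments` line printed by `.bugcheck` and pulls out the PRCB address and the processor number. The function name `parse_bugcheck_arguments` is hypothetical, and treating Arg4 as the hung processor's index follows the author's observation above rather than anything the debugger output itself states.

```python
def parse_bugcheck_arguments(line: str) -> list[int]:
    """Parse the 'Arguments' line printed by .bugcheck into integers."""
    fields = line.split()
    if not fields or fields[0] != "Arguments":
        raise ValueError("not a .bugcheck 'Arguments' line")
    # WinDbg prints 64-bit values as high`low; drop the backtick before parsing.
    return [int(field.replace("`", ""), 16) for field in fields[1:]]


line = ("Arguments 00000000`00000004 00000000`00000000 "
        "fffff880`030d6180 00000000`0000002b")
ticks, _, prcb, cpu = parse_bugcheck_arguments(line)

# Assumption: Arg4 is the logical processor number of the hung CPU.
print(f"timeout ticks={ticks}, PRCB=0x{prcb:016x}, processor index={cpu}")
```

With the values from this dump, the processor index comes out as 43 (0x2b), which is the number passed to `!prcb` further down.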

 

0: kd> vertarget

Windows 7 Kernel Version 7601 (Service Pack 1) MP (64 procs) Free x64

Product: Server, suite: Enterprise TerminalServer SingleUserTS

Built by: 7601.17514.amd64fre.win7sp1_rtm.101119-1850

Machine Name:

Kernel base = 0xfffff800`01608000 PsLoadedModuleList = 0xfffff800`0184de90

Debug session time: Sun Mar 10 20:06:52.797 2013 (UTC + 9:00)

System Uptime: 3 days 7:38:30.811

 

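Why a server OS shows up as a client (Windows 7) here, I am not sure _ _;; Presumably it is because build 7601 is shared by Windows 7 SP1 and Windows Server 2008 R2 SP1, so the banner only names the kernel family; the "Product: Server" line is what identifies the server edition.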
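
To make the distinction less confusing, here is a rough sketch that reads the `Built by:` and `Product:` lines from the `vertarget` output instead of the banner. The helper names are hypothetical, the build-string layout is inferred from this one sample, and treating `LanManNt` as a server product value is an assumption not shown in this dump.

```python
import re
from datetime import datetime


def parse_build_string(built_by: str) -> dict:
    """Split a value like 7601.17514.amd64fre.win7sp1_rtm.101119-1850."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\w+)\.(\w+)\.(\d{6}-\d{4})",
                     built_by.strip())
    if m is None:
        raise ValueError(f"unrecognized build string: {built_by!r}")
    build, revision, flavor, branch, stamp = m.groups()
    return {
        "build": int(build),
        "revision": int(revision),
        "flavor": flavor,
        "branch": branch,
        # Layout assumed to be YYMMDD-HHMM, based on this sample only.
        "built": datetime.strptime(stamp, "%y%m%d-%H%M"),
    }


def is_server(product_line: str) -> bool:
    """True when the 'Product:' line reports a server product type."""
    m = re.match(r"Product:\s*(\w+)", product_line)
    # Assumption: 'LanManNt' (domain controller) also counts as a server.
    return bool(m) and m.group(1).lower() in {"server", "lanmannt"}


info = parse_build_string("7601.17514.amd64fre.win7sp1_rtm.101119-1850")
print(info["build"], info["branch"], info["built"].date())
print(is_server("Product: Server, suite: Enterprise TerminalServer SingleUserTS"))
```

Checking the product line this way avoids relying on the "Windows 7" wording in the banner.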

 

0: kd> !running

 

System Processors:  (ffffffffffffffff)

  Idle Processors:  (ffffffffffffffff) (0000000000000000) (0000000000000000) (0000000000000000)

 

All processors idle.

 

0: kd> !prcb 2b

PRCB for Processor 43 at fffff880030d6180:

Current IRQL -- 0

Threads--  Current fffff880030e1ec0 Next 0000000000000000 Idle fffff880030e1ec0

Processor Index 43 Number (0, 43) GroupSetMember 80000000000

Interrupt Count -- 01398caf

Times -- Dpc    00000001 Interrupt 00000001

         Kernel 01186c39 User      00000000

 

As mentioned above, the PRCB (processor control block) at the Arg3 address turns out to belong to logical processor 0x2B (43).
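
The `GroupSetMember` value in the `!prcb` output is another way to confirm the processor number, since it is a bit mask with one bit set for this processor. A minimal sketch, assuming the mask is a plain affinity bit mask within the processor group (the function name is hypothetical):

```python
def processors_in_mask(mask: int) -> list[int]:
    """Return the indexes of all set bits in a processor affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


group_set_member = 0x80000000000  # value copied from the !prcb output
print(processors_in_mask(group_set_member))
```

For this dump the only set bit is bit 43, which agrees with both Arg4 and the "Processor Index 43" line.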

 

0: kd> !numa

NUMA Summary:

------------

    Number of NUMA nodes : 4

    Number of Processors : 64

    MmAvailablePages     : 0x00F19C26

    KeActiveProcessors   :

    **************************************************************** (ffffffffffffffff)

 

    NODE 0 (FFFFF80001808C00):

        Group            : 65535 (Assigned, Committed, Assignment Adjustable)

        ProcessorMask    :  (ffff)

        ProximityId      : 0

        Capacity         : 16

        Seed             : 0x00000004

        Color            : 0x00000000

        MmShiftedColor   : 0x00000000

        Right            : 0x00000000

        Left             : 0x0000000F

        Zeroed Page Count: 0x000000000039CA06

        Free Page Count  : 0x0000000000000000

 

    NODE 1 (FFFFF880024BE380):

        Group            : 0 (Assigned, Committed, Assignment Adjustable)

        ProcessorMask    :  (ffff0000)

        ProximityId      : 1

        Capacity         : 16

        Seed             : 0x00000010

        Color            : 0x00000001

        MmShiftedColor   : 0x00000100

        Right            : 0x00000010

        Left             : 0x0000001F

        Zeroed Page Count: 0x00000000003CA5D3

        Free Page Count  : 0x0000000000000000

 

    NODE 2 (FFFFF88002BEA380):

        Group            : 0 (Assigned, Committed, Assignment Adjustable)

        ProcessorMask    :  (ffff00000000)

        ProximityId      : 2

        Capacity         : 16

        Seed             : 0x00000020

        Color            : 0x00000002

        MmShiftedColor   : 0x00000200

        Right            : 0x00000020

        Left             : 0x0000002F

        Zeroed Page Count: 0x00000000003D2B16

        Free Page Count  : 0x0000000000000005

 

    NODE 3 (FFFFF88003322380):

        Group            : 0 (Assigned, Committed, Assignment Adjustable)

        ProcessorMask    :  (ffff000000000000)

        ProximityId      : 3

        Capacity         : 16

        Seed             : 0x00000032

        Color            : 0x00000003

        MmShiftedColor   : 0x00000300

        Right            : 0x00000030

        Left             : 0x0000003F

        Zeroed Page Count: 0x00000000003CE082

        Free Page Count  : 0x0000000000000000

 

Servers often have a very large number of processors, so check the NUMA information to see how the processors are currently grouped.

 

Logical processor 43 corresponds to physical processor 3, which makes it possible to confirm that the problem appeared on a specific core and thread.
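
The node lookup can be sketched with the `ProcessorMask` values from the `!numa` output. The masks below are copied from that output; the function name is hypothetical. Note that `!numa` numbers nodes from 0, so node 2 is the third socket, which this post counts as physical processor 3.

```python
# ProcessorMask values copied from the !numa output (node number -> mask).
NODE_MASKS = {
    0: 0x000000000000ffff,
    1: 0x00000000ffff0000,
    2: 0x0000ffff00000000,
    3: 0xffff000000000000,
}


def numa_node_of(cpu: int, node_masks: dict[int, int]) -> int:
    """Return the NUMA node whose processor mask contains the given CPU."""
    for node, mask in node_masks.items():
        if (mask >> cpu) & 1:
            return node
    raise ValueError(f"processor {cpu} is not in any node mask")


node = numa_node_of(43, NODE_MASKS)
print(f"logical processor 43 -> NUMA node {node} (socket {node + 1}, 1-based)")
```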

 

Logical Processor   Physical Socket   Core   Thread
0                   1                 0      0
1                   1                 0      1
2                   1                 1      0
3                   1                 1      1
4                   1                 2      0
5                   1                 2      1
6                   1                 3      0
7                   1                 3      1
8                   1                 4      0
9                   1                 4      1
10                  1                 5      0
11                  1                 5      1
12                  1                 6      0
13                  1                 6      1
14                  1                 7      0
15                  1                 7      1
16                  2                 0      0
17                  2                 0      1
18                  2                 1      0
19                  2                 1      1
20                  2                 2      0
21                  2                 2      1
22                  2                 3      0
23                  2                 3      1
24                  2                 4      0
25                  2                 4      1
26                  2                 5      0
27                  2                 5      1
28                  2                 6      0
29                  2                 6      1
30                  2                 7      0
31                  2                 7      1
32                  3                 0      0
33                  3                 0      1
34                  3                 1      0
35                  3                 1      1
36                  3                 2      0
37                  3                 2      1
38                  3                 3      0
39                  3                 3      1
40                  3                 4      0
41                  3                 4      1
42                  3                 5      0
43                  3                 5      1
44                  3                 6      0
45                  3                 6      1
46                  3                 7      0
47                  3                 7      1
48                  4                 0      0
49                  4                 0      1
50                  4                 1      0
51                  4                 1      1
52                  4                 2      0
53                  4                 2      1
54                  4                 3      0
55                  4                 3      1
56                  4                 4      0
57                  4                 4      1
58                  4                 5      0
59                  4                 5      1
60                  4                 6      0
61                  4                 6      1
62                  4                 7      0
63                  4                 7      1
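
The mapping listed above follows a regular pattern, so it can be reproduced with a few lines of arithmetic. This is only a sketch under the assumptions visible in this dump: 16 logical processors per socket, 2 threads per core, and processors numbered linearly across sockets. Other systems may enumerate processors differently, and the function name is hypothetical.

```python
def topology(cpu: int, cpus_per_socket: int = 16,
             threads_per_core: int = 2) -> tuple[int, int, int]:
    """Map a logical processor number to (socket, core, thread).

    Sockets are 1-based and cores/threads 0-based, matching the table.
    Assumes linear enumeration: all threads of a core, then the next core.
    """
    socket = cpu // cpus_per_socket + 1
    index_in_socket = cpu % cpus_per_socket
    core = index_in_socket // threads_per_core
    thread = index_in_socket % threads_per_core
    return socket, core, thread


socket, core, thread = topology(43)
print(f"logical processor 43 -> socket {socket}, core {core}, thread {thread}")
```

For logical processor 43 this gives socket 3, core 5, thread 1, the same as the corresponding row in the listing.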
