Friday, October 05, 2007

Using KrView (KernView) to profile CPU usage by kernel-mode and user-mode processes and drivers

KrView (kernrate viewer) is a profiling tool for tracking CPU utilization by kernel-mode and user-mode processes that can be used to fine tune device drivers.

It can be hard to find out the real resource usage on a system. Take the System process for example, that has Interrupts, DPC's (Deferred Procedure Calls) and such, all calculated by subtracting the total usage from 100.

Also, due to context switches "bellow the radar" that happen between the system clock interval checks (system clock resolution - On Windows it's usually 15 ms on 64bit systems, 10ms on 32 bit systems) the numbers in task manager are hardly accurate.

There are other tools that use the Context Switch Deltas to print out more accurate numbers, but they still don't give you enough details about kernel mode drivers and such.


While other operating systems such as OpenBSD or Solaris have "systat vmstat" and the DTrace script intrstat respectively, Windows has KernRate (part of Windows 2003 rc) and KrView (to interpret and view kernrate data).

One tool that does give you a better idea of what happens in kernel mode and with drivers is KrView - the Kernrate Viewer.

I've got the system pushing quite a lot of data through the network, so, as you can see from the following:


b57nd60x 2214 67227 4 % 823329
tcpip 1128 67227 2 % 419474
ndis 1105 67226 2 % 410927


b57nd60x.sys is the Broadcom NetXtreme Gigabit Ethernet NDIS6.0 Driver. tcpis is the TCP/IP stack and ndis is the Network Driver Interface Specification, a standard API for NICs.

Start - run - cmd ->
"c:\Program Files\KrView\Kernrates\Kernrate_i386_XP.exe" > output.txt && notepad output.txt

C:\Program Files\KrView\Kernrates>Kernrate_i386_XP.exe
/==============================\
<>
\==============================/
Date: 2007/10/05 Time: 14:23:38
Machine Name: ALAP
Number of Processors: 2
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_LEVEL: 6
PROCESSOR_REVISION: 0f0a
Physical Memory: 2046 MB
Pagefile Total: 5032 MB
Virtual Total: 2047 MB
PageFile1: \??\C:\pagefile.sys, 3067MB
OS Version: 6.0 Build 6000 Service-Pack: 0.0
WinDir: C:\Windows

Kernrate User-Specified Command Line:
Kernrate_i386_XP.exe


Kernel Profile (PID = 0): Source= Time,
Using Kernrate Default Rate of 25000 events/hit
Starting to collect profile data

***> Press ctrl-c to finish collecting profile data
===> Finished Collecting Data, Starting to Process Results

------------Overall Summary:--------------

P0 K 0:00:28.251 (42.0%) U 0:00:09.157 (13.6%) I 0:00:29.796 (44.3%) DPC 0:00:10.623 (15.8%) Interrupt 0:00:01.778 ( 2
%)
Interrupts= 370214, Interrupt Rate= 5509/sec.

P1 K 0:00:17.425 (25.9%) U 0:00:09.219 (13.7%) I 0:00:40.560 (60.4%) DPC 0:00:10.436 (15.5%) Interrupt 0:00:01.544 ( 2
%)
Interrupts= 370229, Interrupt Rate= 5509/sec.

TOTAL K 0:00:45.677 (34.0%) U 0:00:18.376 (13.7%) I 0:01:10.356 (52.3%) DPC 0:00:21.060 (15.7%) Interrupt 0:00:03.322 ( 2
%)
Total Interrupts= 740443, Total Interrupt Rate= 11018/sec.


Total Profile Time = 67205 msec

BytesStart BytesStop BytesDiff.
Available Physical Memory , 200392704, 549707776, 349315072
Available Pagefile(s) , 4235825152, 4270665728, 34840576
Available Virtual , 2122371072, 2122371072, 0
Available Extended Virtual , 0, 0, 0

Total Avg. Rate
Context Switches , 949913, 14135/sec.
System Calls , 2540971, 37809/sec.
Page Faults , 469998, 6993/sec.
I/O Read Operations , 15073, 224/sec.
I/O Write Operations , 37020, 551/sec.
I/O Other Operations , 22403, 333/sec.
I/O Read Bytes , 378399597, 25104/ I/O
I/O Write Bytes , 605689075, 16361/ I/O
I/O Other Bytes , 1259301, 56/ I/O

-----------------------------

Results for Kernel Mode:
-----------------------------

OutputResults: KernelModuleCount = 184
Percentage in the following table is based on the Total Hits for the Kernel

Time 45690 hits, 25000 events per hit --------

Module Hits msec %Total Events/Sec
ntkrnlpa 37371 67226 81% 13897524
b57nd60x 2214 67227 4% 823329
tcpip 1128 67227 2% 419474
ndis 1105 67226 2% 410927
hal 1101 67226 2% 409439
win32k 662 67227 1% 246180
dne2000 651 67227 1% 242090
afd 317 67227 0% 117884
iaNvStor 256 67226 0% 95201
VMNetSrv 147 67227 0% 54665
NETIO 131 67226 0% 48716
nvlddmkm 108 67227 0% 40162
fltmgr 93 67226 0% 34584
Ntfs 92 67226 0% 34212
CVPNDRVA 91 67227 0% 33840
dxgkrnl 76 67227 0% 28262
USBPORT 30 67227 0% 11156
rspndr 20 67227 0% 7437
iaStor 17 67226 0% 6321
pacer 13 67227 0% 4834
sptd 11 67226 0% 4090
usbuhci 7 67227 0% 2603
fileinfo 7 67226 0% 2603
MpFilter 6 67227 0% 2231
ecache 5 67226 0% 1859
HIDPARSE 4 67227 0% 1487
usbhub 4 67227 0% 1487
HIDCLASS 3 67227 0% 1115
CLASSPNP 3 67226 0% 1115
luafv 2 67227 0% 743
fvevol 2 67226 0% 743
cdd 1 67227 0% 371
ATMFD 1 67227 0% 371
rdbss 1 67227 0% 371
hidusb 1 67227 0% 371
mouhid 1 67227 0% 371
NETw4v32 1 67227 0% 371
usbehci 1 67227 0% 371
watchdog 1 67227 0% 371
volsnap 1 67226 0% 371
ksecdd 1 67226 0% 371
ataport 1 67226 0% 371
volmgr 1 67226 0% 371
acpi 1 67226 0% 371

======== END OF RUN =======
======== NORMAL END OF RUN =======


So, is this all? Nope, you can "zoom in" on individual modules (use multiple -z entries if you need to).
kernrate -xk -u -z intelppm -j srv**http://msdl.microsoft.com/download/symbols > krlog.txt


So, how does zooming in work?
C:\Users\cmihai>kernrate -xk -u -z hal

----- Zoomed module hal.dll (Bucket size = 16 bytes, Rounding Down) --------
Percentage in the following table is based on the Total Hits for this Zoom Module

Time 5 hits, 25000 events per hit --------
Module Hits msec %Total Events/Sec
READ_PORT_ULONG 1 4726 16 % 5289
READ_PORT_USHORT 1 4726 16 % 5289
KeReleaseQueuedSpinLock 1 4726 16 % 5289
KeAcquireInStackQueuedSpinLockRaiseToSynch 1 4726 16 % 5289
KfAcquireSpinLock 1 4726 16 % 5289
HalpGenerateInterrupt 1 4726 16 % 5289

================================= END OF RUN ==================================
============================== NORMAL END OF RUN ==============================


Other Debugging Links

0 comments: