Minimizing esxtop output for targeted stats captures

You can easily use `esxtop` in batch mode to generate a CSV file containing specific performance stats.

There are some caveats to this process, but you can get through it relatively unscathed, generally capturing only what you need.


esxtop is too verbose:
The following 47 fields (plus the timestamp column) are always captured, even if you minimize the fields by creating a configuration file, as we will do shortly:

(PDH-CSV 4.0) (UTC)(0) # aka the time
\\host.local\Memory\Memory Overcommit (1 Minute Avg)
\\host.local\Memory\Memory Overcommit (5 Minute Avg)
\\host.local\Memory\Memory Overcommit (15 Minute Avg)
\\host.local\Physical Cpu Load\Cpu Load (1 Minute Avg)
\\host.local\Physical Cpu Load\Cpu Load (5 Minute Avg)
\\host.local\Physical Cpu Load\Cpu Load (15 Minute Avg)
\\host.local\Physical Cpu(0)\% Processor Time
\\host.local\Physical Cpu(1)\% Processor Time
\\host.local\Physical Cpu(2)\% Processor Time
\\host.local\Physical Cpu(3)\% Processor Time
\\host.local\Physical Cpu(4)\% Processor Time
\\host.local\Physical Cpu(5)\% Processor Time
\\host.local\Physical Cpu(6)\% Processor Time
\\host.local\Physical Cpu(7)\% Processor Time
\\host.local\Physical Cpu(_Total)\% Processor Time
\\host.local\Physical Cpu(0)\% Util Time
\\host.local\Physical Cpu(1)\% Util Time
\\host.local\Physical Cpu(2)\% Util Time
\\host.local\Physical Cpu(3)\% Util Time
\\host.local\Physical Cpu(4)\% Util Time
\\host.local\Physical Cpu(5)\% Util Time
\\host.local\Physical Cpu(6)\% Util Time
\\host.local\Physical Cpu(7)\% Util Time
\\host.local\Physical Cpu(_Total)\% Util Time
\\host.local\Memory\Machine MBytes
\\host.local\Memory\Kernel MBytes
\\host.local\Memory\NonKernel MBytes
\\host.local\Memory\Free MBytes
\\host.local\Memory\Kernel Managed MBytes
\\host.local\Memory\Kernel MinFree MBytes
\\host.local\Memory\Kernel Reserved MBytes
\\host.local\Memory\Kernel Unreserved MBytes
\\host.local\Memory\Kernel State
\\host.local\Memory\PShare Shared MBytes
\\host.local\Memory\PShare Common MBytes
\\host.local\Memory\PShare Savings MBytes
\\host.local\Memory\Swap Used MBytes
\\host.local\Memory\Swap Target MBytes
\\host.local\Memory\Swap MBytes Read/sec
\\host.local\Memory\Swap MBytes Write/sec
\\host.local\Memory\Total Compressed MBytes
\\host.local\Memory\Total Saved By Compression MBytes
\\host.local\Memory\Memctl Current MBytes
\\host.local\Memory\Memctl Target MBytes
\\host.local\Memory\Memctl Max MBytes
\\host.local\Power\Power Usage Now Watts
\\host.local\Power\Power Usage Cap Watts
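
To see which fields a given capture actually contains, a small sketch like the following can list the header row one column per line. The function name `esxtop_headers` is made up for illustration, and it assumes no header contains an embedded comma, which I have not verified for every counter.

```shell
# List the column headers of an esxtop batch CSV, one per line, numbered.
# Hypothetical helper; assumes no header contains an embedded comma.
esxtop_headers() {
    # first line only -> one field per line -> drop quotes -> number the lines
    head -n 1 "$1" | tr ',' '\n' | tr -d '"' | grep -n ''
}

# Usage (file name is whatever you redirected esxtop -b into):
#   esxtop_headers capture.csv           # numbered list of columns
#   esxtop_headers capture.csv | wc -l   # how many columns were captured
```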

The anatomy of the esxtop config file:

Within `esxtop`, there are eight categories of counters. Within these categories, there are counter sets. Within these counter sets, there are counters.

Take a quick look at this gist I just compiled; also take a look at this VMware KB article to get a better idea of the counters.

Each line of the config file corresponds to one category of counters. For example, line 1 is the CPU category.

The lines are as follows (along with their declarative letter), in their correct config file order:

c:cpu
m:memory
d:disk adapter
u:disk device
v:disk VM
n:network
i:interrupt
p:power mgmt

Each line is made up of a set of letters that declare which counter sets are enabled. The letter case acts as a boolean: an upper-case letter enables its counter set, and a lower-case letter disables it.

Referring to the previously mentioned gist again, you will see that each counter set has a letter assignment.
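
As a rough sketch of the letter-case rule, the helper below prints one config line with the chosen letters switched to upper case. The name `esxtop_line` is hypothetical, and the 26-letter line is simply the layout used in this post rather than something I have confirmed for every esxtop version.

```shell
# Print one config line: the lower-case alphabet, with the letters given
# in $1 (for example "EF") switched to upper case, i.e. enabled.
# Hypothetical helper; pass letters only.
esxtop_line() {
    alphabet=abcdefghijklmnopqrstuvwxyz
    case "$1" in
        "") printf '%s\n' "$alphabet" ;;                  # nothing enabled
        *)  upper=$(printf '%s' "$1" | tr 'a-z' 'A-Z')
            lower=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
            printf '%s\n' "$alphabet" | tr "$lower" "$upper" ;;
    esac
}

# Usage:
#   esxtop_line ""    # a category with every counter set disabled
#   esxtop_line EF    # counter sets E and F enabled
```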

Generating a few esxtop config files:

An example:
If you were to refer to the KB titled “Using esxtop to identify storage performance issues for ESX / ESXi (multiple versions) (1008205)”, you might want to capture CMDS/s, DAVG/cmd, KAVG/cmd, and GAVG/cmd.

Referring to the previously mentioned gist again, you can most easily perform a “lookup” to find which counter set includes each of these specific counters.

Here is a summary:

CMDS/s: part of the IOSTATS counter set, found on the disk adapter (line 3), disk device (line 4), and disk VM (line 5) lines, where it is letter E, G, and F respectively.
DAVG/cmd: part of the LATSTATS/cmd counter set, found on the disk adapter (line 3) and disk device (line 4) lines, where it is letter F and I respectively.
KAVG/cmd: same as DAVG/cmd (LATSTATS/cmd counter set; letter F on the disk adapter line, letter I on the disk device line).
GAVG/cmd: same as DAVG/cmd (LATSTATS/cmd counter set; letter F on the disk adapter line, letter I on the disk device line).

So, to limit our `esxtop` invocation to just these counters (plus the other 47 “mandatory” counters mentioned previously), we could go through the curses interface, select the letters (with `f`), and save a conf file (with `W`). Alternatively, we can generate a conf file by hand from what we already know.

Let’s target all of the above counters.

echo abcdefghijklmnopqrstuvwxyz > /esxtop_kb1008205.conf
echo abcdefghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdEFghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdefGhIjklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdeFghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdefghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdefghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo abcdefghijklmnopqrstuvwxyz >> /esxtop_kb1008205.conf
echo 5d >> /esxtop_kb1008205.conf # this is the version, followed by the declarative letter of the start screen for esxtop (irrelevant to batch mode)

Now, if we check for the capital letters and return the line number, we will verify what we’ve enabled:

grep -no '[A-Z]' /esxtop_kb1008205.conf

This will return:

3:E
3:F
4:G
4:I
5:F

Note that the default config file is located at `/.esxtop50rc`

Capture perf data to a CSV:
1) Run esxtop in batch mode, capturing only the counters configured previously, once every 5 seconds for the next 2 minutes (120 seconds / 5 seconds = 24 iterations), and write the output to the file esxtop_storage_test.csv:

#with no compression
nohup esxtop -b -c ./esxtop_kb1008205.conf -d 5 -n 24 > esxtop_storage_test.csv &
tail -f esxtop_storage_test.csv | awk -F "," '{print $1}'

#with compression 
nohup esxtop -b -c ./esxtop_kb1008205.conf -d 5 -n 24 | gzip > esxtop_storage_test.csv.gz &
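
The iteration count is just the capture duration divided by the delay. A minimal sketch of that arithmetic, with a made-up helper name and rounding up so the capture is never shorter than requested:

```shell
# Number of samples needed to cover a duration at a given delay.
# Hypothetical helper; rounds up when the duration is not a multiple of the delay.
esxtop_iterations() {
    duration=$1   # seconds to capture
    delay=$2      # seconds between samples (the -d value)
    echo $(( (duration + delay - 1) / delay ))
}

# Usage:
#   esxtop -b -c ./esxtop_kb1008205.conf -d 5 -n "$(esxtop_iterations 120 5)"
```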

Only target specific “entities”:
An entity, as defined by VMware, is any: SchedGrp (process), Adapter, Device, NetPort, or InterruptVector.

To generate the entity list:

esxtop -export-entity /entity.out

After reviewing the list with `less`, create a file with the specific entities that you wish to include. Note that you MUST include the header for each type of entity:

grep vmhba0 /entity.out > /entity.in
vi /entity.in
## in this case
### add a line "Adapter" before the adapter
### add a line "Device" before the device

Now that you have generated a file to watch your vmhba0 adapter(s) and device(s), run:

nohup esxtop -import-entity /entity.in -b -c ./esxtop_kb1008205.conf -d 5 -n 24 | gzip > esxtop_storage_test_vmhba0.csv.gz &

Open with perfmon:

Copy to a Windows box and decompress the perf data for analysis:
1) use pscp or whatever to copy the file:

pscp root@esxhost:/esxtop_storage_test.csv.gz .

2) use 7za or whatever to decompress the file:

7za x esxtop_storage_test.csv.gz

Open the stats in perfmon:
1) start perfmon
2) open the log file
3) select esxtop_storage_test.csv as your source (yes, times are UTC+0)
4) Go to the data tab, and select the counters you wish to view
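
If perfmon is not at hand, a similar counter selection can be roughly approximated in the shell before copying anything. This is only a sketch: the name `esxtop_columns` is invented, it assumes simple CSV with no embedded commas, and the pattern should be a plain string without backslashes (for example `Swap Used`).

```shell
# Print the timestamp column plus every column whose header matches $2.
# Hypothetical helper; assumes comma-separated fields with no embedded commas.
esxtop_columns() {
    awk -F, -v pat="$2" '
        NR == 1 {
            n = 1; cols[1] = 1                   # always keep the timestamp
            for (i = 2; i <= NF; i++)
                if ($i ~ pat) cols[++n] = i      # remember matching columns
        }
        {
            line = ""
            for (j = 1; j <= n; j++)
                line = line (j > 1 ? "," : "") $(cols[j])
            gsub(/"/, "", line)                  # strip the surrounding quotes
            print line
        }
    ' "$1"
}

# Usage:
#   esxtop_columns esxtop_storage_test.csv 'Swap Used'
```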

Generating some i/o:
IOmeter is a piece of software that’s generally used to perform i/o tests against SANs.
VMware actually recommends its use, and even created a template for it that’s available in a KB article.

IOmeter can approximate some kinds of activity, but you’re better off writing your own stress tests to mimic your actual I/O activity, or using other, more specific testing tools.

Is it worth it?
You decide. The vSphere Client has a performance tab, and there is a way to collect historical data beyond one hour.

