Friday, September 14, 2018

Viewing Data from an Extended Event File

I got the idea for this post from a blog post by Paul Randall: Capturing Wait Stats for a Single Operation. He shows how to use an extended event session to capture wait stats and then how to pull that data into a temp table. It was very helpful but it didn't show how to build the query if you were using a different extended event session.

Extended Events were introduced as a replacement for SQL profiler and server side tracing. They are a great lightweight way to gather data from a SQL server. However, once you have collected the data into a file it can be difficult to use that data. This run book will detail how to pull the data from and Extended Event file into a temp table that can be viewed, sorted and aggregated.

First we need to gather data from an Extended Event session. Connect to a SQL server and run the below code to create the extended event session. This will create an Xevent session that grabs rpc_completed, sql_statement_completed and sql_batch_completed.

NOTE: Notice the session is logging data to this path: D:\DBA_Tools\XEvents\DBA_JR_QueryText . Modify this for your environment.

NOTE2: Notice I have some lines commented out. You can uncomment them if you want to only capture events that meet a threshold.

IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name='DBA_JR_QueryText')
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
        AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
        --AND [duration]>(1000000)               -- THIS IS IN MICROSECONDS    
        --AND [logical_reads]>(2)
        --AND [physical_reads]>(4)
        --AND [writes]>(1)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
        AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
        --AND [duration]>(1000000)               -- THIS IS IN MICROSECONDS   
        --AND [logical_reads]>(2)
        --AND [physical_reads]>(4)
        --AND [writes]>(1)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
        AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
        --AND [duration]>(1000000)               -- THIS IS IN MICROSECONDS
        --AND [logical_reads]>(2)
        --AND [physical_reads]>(4)
        --AND [writes]>(1)
ADD TARGET package0.event_file(SET filename=N'D:\DBA_Tools\XEvents\DBA_JR_QueryText'),

Once the query has run, locate the new Event Session in SSMS and right click it and select Start Session

Let the session run for a few minutes to grab some data then. Right click it again and select Stop Session. Hopefully we have some data in the file. We can check by running the below query. Again, modify the path if you did so in step 2

-- See how many events are in our file
FROM sys.fn_xe_file_target_read_file

If this query returns a number we can move on, otherwise go back to step 3 and run the extended event session for longer. Next we want to pull all the data into temp table and we will use a XML data type to hold the event data. Lets build a temp temp table, import the file, and look at the raw data. We will need to examine the XML to build the next query

DROP TABLE IF EXISTS #raweventdata
-- pull the data into a temp table
CREATE TABLE #RawEventData (
    event_data XML);
-- Pull the data into the table
INSERT INTO #RawEventData (event_data)
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file(
-- Now lets look at the raw data
SELECT FROM #RawEventData AS red
You should now get some data back that looks like this. We have our identity column and then a clickable XML column

Next we want to click on one of the XML fields so we can see what data is in there. I have put boxes around four different elements in the XML.

The red box is for the entire event, notice it has multiple fields (called attributes)

The blue box is a data tag that only has a name attribute and a subtag named value

The purple box is a data tag with a name attribute and two subtags: value and text

The green box is an action tag with a name and a package attribute and it has a value subtag. If you look up at the code that created the extended event you will notice we defined the actions. These are pieces of data that you don't get by default when you add a certain extended event. I have redacted the actual value since that is a username.

Now we need to parse the XML using XQuery to get the data out and present it to the screen. I have moved the data and action tags around so my result set is ordered how I like it. Don't run this yet, just look it over and move on to the next step.

-- Extract the data from the XML
    event_data.value ('(/event/@name)[1]', 'varchar(100)') AS name,
    event_data.value ('(/event/@timestamp)[1]', 'varchar(100)') AS timestamp,
    event_data.value ('(/event/action[@name=''username'']/value)[1]', 'varchar(100)') AS username,
    event_data.value ('(/event/action[@name=''session_id'']/value)[1]', 'varchar(100)') AS spid,
    event_data.value ('(/event/action[@name=''client_app_name'']/value)[1]', 'varchar(100)') AS client_app_name,
    event_data.value ('(/event/action[@name=''client_hostname'']/value)[1]', 'varchar(100)') AS client_hostname,
    event_data.value ('(/event/action[@name=''database_name'']/value)[1]', 'varchar(100)') AS databasename,
    event_data.value ('(/event/data[@name=''object_name'']/value)[1]', 'varchar(100)') AS objectname,
    event_data.value ('(/event/data[@name=''cpu_time'']/value)[1]', 'varchar(100)') AS cpu_time,
    event_data.value ('(/event/data[@name=''duration'']/value)[1]', 'VARCHAR(100)') AS duration,
    event_data.value ('(/event/data[@name=''physical_reads'']/value)[1]', 'varchar(100)') AS physical_reads,
    event_data.value ('(/event/data[@name=''logical_reads'']/value)[1]', 'varchar(100)') AS logical_reads,
    event_data.value ('(/event/data[@name=''writes'']/value)[1]', 'varchar(100)') AS writes,
    event_data.value ('(/event/data[@name=''result'']/text)[1]', 'varchar(100)') AS result,
    event_data.value ('(/event/data[@name=''row_count'']/value)[1]', 'varchar(100)') AS row_count,
    event_data.value ('(/event/data[@name=''statement'']/value)[1]', 'varchar(100)') AS sqlstatement

FROM #RawEventData AS red

This looks pretty complicated and working with XML can be pretty complicated but we are only pulling values so lets break it down.

The first line :
    event_data.value ('(/event/@name)[1]', 'varchar(100)') AS name,

We are selecting the event_data column in our temp table and extracting some data. Look back up at the red box in the picture. The event tag is our top level tag so it will be the first element in all of our lines. We are then trying to get name attribute which is why we use /event/@name then we cast that as a varchar(100). The second line is also an attribute for the event tag, the timestamp attribute so line looks the same.

The third line:
    event_data.value ('(/event/action[@name=''username'']/value)[1]', 'varchar(100)') AS username,

We are now moving down to a subtag of event, an action tag named username and we want to get the value subtag of that action so we use /event/action[@name=''username'']/value . Look at the green box in the image above and you can see how they relate.

Now move down a few more line:
    event_data.value ('(/event/data[@name=''duration'']/value)[1]', 'varchar(100)') AS duration,

This is very similar to the last example except this time we are accessing a data tag instead of an action tag. This corresponds to the blue box in our image.

If we skip a few more lines we see:
    event_data.value ('(/event/data[@name=''result'']/text)[1]', 'varchar(100)') AS result,

This is another data tag but this time the subtag we are trying to get the value for is a text tag instead of a value tag. See the purple box in the image, note we could have grabbed both the text and the value subtags here.
If you run the code from step 10 now, you should get multiple rows and multiple columns in your output. I have scrubbed the host names and usernames from my output but they did return data.

One thing to note is that I cast all the columns as VARCHAR in my code. If we wanted to do any aggregation or computations using some of the values (like cpu_time, reads and writes) we would want to cast those as integers.