Sunday, June 28, 2009

tech: Oracle 11 "new features" presentation

I recently attended an Oracle presentation. The audience was quite broad: architects, developers, middle management and the odd DBA. To be honest it was a bit of a sales pitch, focusing more on Java and application server developments than on the database (my background is predominantly SAP applications running on Oracle databases, so I have little experience of Oracle Applications).

Still, I did learn about the new features in 11g (I should be attending a one-day 11g technical overview at Oracle in Reading next month, which will be more detailed).

1) Grid computing enhancements

Oracle are keen to see data centre and computing resource consolidation. Apparently, since 1998 Oracle worldwide have consolidated 40 data centres into just 3! More efficient use of resources is very fashionable: large production servers use a lot of electricity and need heavy air-conditioning, yet traditionally production servers are sized for "peak load", which often means that for 90%+ of the time there is a lot of wasted capacity.

This huge energy consumption is not just adding to global warming; data centres are finding they are hitting limits in terms of total energy usage (power supply, air-conditioning requirements ...).

Hence Oracle have identified using resources more efficiently (i.e. grid computing) as the way forward. (Traditionally there has been a silo mentality in the industry, where businesses and services want maximum control over their environments and so have been reluctant to move to grid computing.)

So what is new in 11g? The early versions of RAC were not easy: a challenge even for experienced DBAs and Unix System Administrators working closely together. In April 2005, I attended one of the first runs of the 1-week course "Oracle 10g RAC" at Oracle, Reading. My colleagues had recently implemented an Oracle 9i RAC installation, and this had involved a great deal of detailed knowledge of cluster configuration and some fairly tricky low-level o/s configuration. The course was interesting but tough; setting up RAC was not for the faint-hearted.

Anyway, in 11g there have been further improvements, and in theory new RAC nodes can now be added via Enterprise Manager.

Also, for certain types of processing there have been "locking issues" which limited RAC installations to around 3 instances. Apparently these issues have been resolved in 11g, and some clients, e.g. Amazon, are running RAC across 30+ instances.

At the heart of RAC (making RAC work so much better than the old Oracle Parallel Server architecture) is Cache Fusion, and it appears that Oracle has made significant improvements "behind the scenes" in 11g:

Cache Fusion Protocols
Oracle revolutionized clustering technology by introducing the Cache Fusion protocol that allows nodes in a cluster to communicate with each other’s memory using a high speed interconnect. Cache Fusion is one of the key elements of Oracle’s Real Application Cluster technology.
Oracle Database 11g introduces the next generation of Cache Fusion Protocols. The new workload aware protocols dynamically alter the inter-node messaging behavior depending on the type of workload being processed in order to maximize performance. For example, a newly introduced read optimized protocol significantly reduces the number of inter-node messages for read operations. Similarly, other new protocols optimize the messaging behavior for update and table scan operations. Oracle automatically selects the best protocol to use based on the workload profile. These optimizations to the Cache Fusion Protocols significantly reduce inter-node messaging and enhance the scalability of Real Application Clusters.

2) Storage Management

There was quite a lot of discussion around storage management:

A Terabyte sized database that used to be a novelty some years back is almost a normal occurrence today as businesses try to capture as much data they possibly can about their customers and store it for as long as they can in order to understand their business environments better. Similarly, the volume of data being generated in the scientific research and health-care sector is literally exploding with some organizations expecting to generate more than a Petabyte (1000 Terabytes) of data every single year.

As a bit of an "interesting aside", the presenters talked about an example of a modern application with a huge volume of data: emergency service management software. The regional/national emergency centres are now capturing two real-time sources of data: (1) the GPS position of every vehicle and (2) real-time maps of trouble-spots, typically locations with various types of pollution problem (each with its own specialist protective clothing, breathing apparatus and other requirements). This application is interesting because (a) the storage requirements for this sort of data capture are high and (b) it uses filtering/blade technologies to filter out some of the data. Here most of the data is not of interest (most of the time, emergency vehicles are not entering pollution black-spots). Oracle have bought out a specialist spatial data application software vendor to help them develop these sorts of applications. (NB One of the themes of the presentation was the spectacular number of IT companies Oracle has bought over the last 5 years. The presenters made no comment about the recent Sun buyout, except that only staff in the California head office would know what the plans really are for the future of Java, Solaris x86 ...)

So what is Oracle doing in 11g to help with these new storage management requirements? (1) compression routines, even for some OLTP tables ("does a delivery company need fast access to completed orders over a year old?"), (2) secure files and (3) simplified volume management via Enterprise Manager. These were all discussed briefly during the presentation, returning again to the Oracle white paper "11g Performance and Scalability":

i) OLTP Table Compression
Oracle introduced data compression in Oracle Database 9i primarily for use with Data Warehouses. Limitations on how data could be manipulated in compressed tables made it unsuitable for OLTP workloads. Oracle Database 11g removes these constraints, allowing traditional DML statements (INSERT, UPDATE, and DELETE) on compressed tables.
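Oracle's actual block-level compression format is proprietary, but the intuition is easy to demonstrate: OLTP tables are full of repeated column values, so even a generic compressor (zlib here, purely as a stand-in for the real algorithm) shrinks them dramatically:

```python
import zlib

# Fake OLTP-style rows: the same cities, statuses and months recur constantly,
# just as they would in a large orders table.
rows = "".join(f"{i % 100:06d},LONDON,DELIVERED,2008-0{(i % 9) + 1}\n"
               for i in range(1000))
raw = rows.encode()
packed = zlib.compress(raw)
print(f"{len(raw)} bytes -> {len(packed)} bytes "
      f"(ratio {len(raw) / len(packed):.1f}x)")
```

The repetition, not the compressor, is doing the work; Oracle exploits the same redundancy at the block level.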

ii) SecureFiles
Unstructured data (documents, spreadsheets, images, multimedia, etc) is becoming more pervasive in the enterprise. This is due in large part to improvements in networking bandwidth enabling the sharing of rich unstructured content and the changing requirements of the regulatory landscape, such as Sarbanes-Oxley and HIPAA... SecureFiles, a feature introduced in Oracle Database 11g, was designed to break the performance barrier preventing enterprises from storing unstructured data in Oracle databases. Similar to LOBs, SecureFiles is a data structure built to store large objects in the database. SecureFiles however, offer a much richer feature set and a vast improvement in performance when compared with LOBs or file systems.

iii) Oracle Database 11g’s Scalable Management automates many administrative activities and provides an intuitive and robust toolset for managing simple and complex environments with Enterprise Manager Grid Control.

I couldn't see anything in this white paper (though they were mentioned briefly in our presentation) about compression of SecureFiles, or compression of redo logs before shipping (Data Guard). Also, regarding OLTP Table Compression, the concept of a "data life-cycle" was discussed: e.g. our deliveries start on fast-access disc storage, then move to slower-access compressed storage, and finally move to tape (for long-term audit purposes).
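The "data life-cycle" idea above is essentially a tiering policy keyed on the age of the row. A hypothetical sketch (the tier names and age cut-offs are my own illustration, not Oracle functionality):

```python
from datetime import date

def storage_tier(order_date: date, today: date) -> str:
    """Hypothetical lifecycle policy: hot disc -> compressed disc -> tape."""
    age_days = (today - order_date).days
    if age_days <= 365:
        return "fast-disc"          # recent orders need quick access
    if age_days <= 7 * 365:
        return "compressed-disc"    # older data: compressed, slower storage
    return "tape-archive"           # long-term audit copies only

today = date(2009, 6, 28)
print(storage_tier(date(2009, 5, 1), today))   # fast-disc
print(storage_tier(date(2006, 5, 1), today))   # compressed-disc
print(storage_tier(date(2000, 5, 1), today))   # tape-archive
```

In the database itself this would be driven by partitioning and compression settings rather than application code, but the policy has the same shape.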

3) High Availability / Data Guard

Data Guard has been enhanced to deal better with network problems and redo log shipping problems when these arise ...

Data Guard is an integral component of Oracle’s High Availability solution and a key component of the Maximum Availability Architecture (MAA). Data Guard automates the process of maintaining standby databases, which are activated during planned and unplanned production outages. Oracle Database 11g enhances the Data Guard capabilities by improving the redo transport performance. Asynchronous (ASYNCH) redo transport has been improved to stream redo data without interruption by decoupling the network messaging between the standby and production database from the shipment of redo data.
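The decoupling described in the quote can be pictured with a producer and a background shipper thread: commits hand redo records to a queue and carry on, while shipping to the standby happens independently. This is a toy model of the idea only, not Data Guard's actual mechanics:

```python
import queue
import threading

redo_queue = queue.Queue()
shipped = []    # stands in for redo received by the standby database

def shipper():
    # Background thread streams redo to the standby;
    # the primary never waits on it.
    while True:
        rec = redo_queue.get()
        if rec is None:
            break
        shipped.append(rec)          # stands in for the network send
        redo_queue.task_done()

t = threading.Thread(target=shipper, daemon=True)
t.start()

# The "production database" commits without blocking on the standby.
for i in range(5):
    redo_queue.put(f"redo-{i}")

redo_queue.join()                    # only this demo waits; a real primary would not
redo_queue.put(None)
t.join()
print(shipped)
```

The key point is in the loop: `put` returns immediately, so commit latency on the primary is independent of the standby's network round-trip.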


4) Flashback Data Archive

I have not used Flashback extensively, but it is clearly something Oracle have been pushing hard since Oracle 9i and 10g. In 11g Oracle have introduced the "Flashback Data Archive", removing the dependency of flashback operations on undo data, which is available only for a limited time before it is aged out:


Oracle9i Database Release 2 introduced the proverbial time machine in the form of the Flashback Query, which allows you to select the pre-changed version of the data. For example, had you changed a value from 100 to 200 and committed, you can still select the value as of two minutes ago even if the change was committed. This technology used the pre-change data from the undo segments. In Oracle Database 10g, this facility was enhanced with the introduction of Flashback Versions Query, where you can even track the changes made to a row as long as the changes are still present in the undo segments.

However, there was a little problem: When the database is recycled, the undo data is cleaned out and the pre-change values disappear. Even if the database is not recycled, the data may be aged out of the undo segments to make room for new changes.

Since pre-11g flashback operations depend on the undo data, which is available only for a short duration, you can't really use it over an extended period of time or for more permanent recording such as for auditing. As a workaround, we resorted to writing triggers to make more permanent records of the changes to the database.

Well, don't despair. In Oracle Database 11g, Flashback Data Archive combines the best of both worlds: it offers the simplicity and power of the flashback queries but does not rely on transient storage like the undo. Rather, it records changes in a more permanent location, the Flashback Recovery Area.
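In effect the Flashback Data Archive is an append-only history that "AS OF" queries read from. A toy model of that idea (SCN here is just a change number; this is an illustration, not Oracle's implementation):

```python
class FlashbackArchive:
    """Toy append-only history: every committed change is kept with its
    system change number (SCN), instead of relying on transient undo."""
    def __init__(self):
        self.history = {}                       # key -> [(scn, value), ...]

    def record(self, scn, key, value):
        self.history.setdefault(key, []).append((scn, value))

    def as_of(self, scn, key):
        value = None
        for s, v in self.history.get(key, []):  # entries arrive in SCN order
            if s <= scn:
                value = v                       # latest change at or before scn
            else:
                break
        return value

arc = FlashbackArchive()
arc.record(100, "balance", 100)
arc.record(200, "balance", 200)
print(arc.as_of(150, "balance"))   # 100 -- the pre-change value is still there
print(arc.as_of(250, "balance"))   # 200
```

Because the history is never aged out, the "two minutes ago" query in the quote keeps working months later, which is exactly what undo-based flashback could not guarantee.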

Saturday, June 27, 2009

tech: Deadlocks due to collisions in "SQL Server's hashing algorithm" (June 2009 SQL Server UserGroup)

At the SQL Server usergroup there was a brief overview of a very technical problem: there appears to be a bug in SQL Server's "hashing algorithm" which under certain circumstances triggers deadlocks.

James' detailed blog:


gives plenty of technical details behind this intriguing problem, and also the extensive collaborative investigation it took to unearth and document this bug.

I have gone through this great blog piece, highlighting what I see as the key concepts and steps, and I have added some further references to key background concepts. This shows the process I went through to understand the issue; hopefully you will find my comments a helpful introduction to this complex topic.

To start with an overview of what "deadlocks" are and how to monitor for them:

So what then is a deadlock? This is when two processes each hold a lock or set of locks that is incompatible with a lock that the other process is trying to acquire. As neither one process can complete the two processes are said to be deadlocked. Rather than look on in morbid fascination SQL Server chooses a victim based on which of the two processes is easiest to roll back based on the amount of resources it has consumed. The victim is then rolled back.
If you have trace flags 1204, 1222 running or are using profiler/SQL Trace to capture deadlock events you can gather information on what processes were involved in the deadlock and which one or ones were chosen as the victim.
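The victim-selection behaviour described in the quote can be sketched as a cycle search over a "wait-for" graph, with the victim chosen as the process that is cheapest to roll back. This is an illustration of the idea only, not SQL Server's actual algorithm:

```python
def find_deadlock(wait_for):
    """Detect a cycle in a wait-for graph: spid -> spid it is blocked by."""
    for start in wait_for:
        seen, spid = [], start
        while spid in wait_for:
            if spid in seen:
                return seen[seen.index(spid):]   # the cycle itself
            seen.append(spid)
            spid = wait_for[spid]
    return []

# SPID 56 waits on 57, and 57 waits on 56: a deadlock.
cycle = find_deadlock({56: 57, 57: 56, 58: 56})

# Like SQL Server, pick as victim the process cheapest to roll back
# (here measured by a made-up "log bytes used" figure).
log_used = {56: 4096, 57: 128}
victim = min(cycle, key=log_used.get)
print(cycle, "victim:", victim)
```

SPID 58 is merely blocked, not deadlocked, so it never appears in the cycle; only the two mutually waiting processes are candidates for rollback.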

James then links to another SQL Server MVP blog which shows how to setup a simple deadlock:

Below are the steps to generate a deadlock so that the behaviour of a deadlock can be illustrated:


-- 1) Create Objects for Deadlock Example
USE TEMPDB

CREATE TABLE dbo.foo (col1 INT)
INSERT dbo.foo SELECT 1

CREATE TABLE dbo.bar (col1 INT)
INSERT dbo.bar SELECT 1

-- 2) Run in first connection
BEGIN TRAN
UPDATE tempdb.dbo.foo SET col1 = 1

-- 3) Run in second connection
BEGIN TRAN
UPDATE tempdb.dbo.bar SET col1 = 1
UPDATE tempdb.dbo.foo SET col1 = 1

-- 4) Run in first connection
UPDATE tempdb.dbo.bar SET col1 = 1


Connection two will be chosen as the deadlock victim

i.e.

Server: Msg 1205, Level 13, State 50, Line 1
Transaction (Process ID 56) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.



Next James, returning to his "deadlock problem", uses standard Microsoft guidelines for analyzing deadlocks.

To understand James' blog article you need to be aware of hobt_ids. In the Microsoft guidelines, hobt_ids (qualified by the "hash value" for the index key being locked) are mentioned only very briefly:

[LOCK] KEY. Identifies the key range within an index on which a lock is held or requested. KEY is represented as KEY: db_id:hobt_id(index key hash value). For example, KEY: 6:72057594057457664 (350007a4d329).
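That `KEY: db_id:hobt_id (hash)` format is easy to pull apart programmatically; a small hypothetical parser (the function name is my own invention, not part of any Microsoft tooling):

```python
import re

def parse_key_lock(waitresource):
    """Split 'KEY: db_id:hobt_id (hash)' into its parts,
    per the format described in the Microsoft guidelines quoted above."""
    m = re.match(r"KEY:\s*(\d+):(\d+)\s*\(([0-9a-f]+)\)", waitresource)
    if not m:
        raise ValueError(f"not a KEY lock resource: {waitresource!r}")
    db_id, hobt_id, key_hash = m.groups()
    return int(db_id), int(hobt_id), key_hash

print(parse_key_lock("KEY: 6:72057594057457664 (350007a4d329)"))
```

This is handy when you are staring at deadlock graph XML and want to group wait resources by database, hobt and key hash.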

Returning to James' deadlock problem: using the standard "deadlock chart" functionality in SQL Profiler, he can see two SPIDs (processes) trying to lock the same hobt_id:

It is telling me that SPID 82 is deadlocked against SPID 85. It is saying that they were both trying to get exclusive locks for a ROW in hobt 72057594693222400.

How was this possible? Why should SPID 82 have a lock on any data being loaded by SPID 85? There was no overlapping data that would allow this to happen. Why was this only starting to happen now? It just didn’t make any sense. Initially you might think that this is due to page splits or some other aspect of the storage where the data ranges may overlap but you will see these are both “Key Locks”.

I ran this query hoping for some clarification.


SELECT OBJECT_NAME(p.[object_id]) AS ObjectName
      ,p.[object_id] AS ObjectID
      ,i.name AS IndexName
      ,hobt_id AS HoBTID
      ,partition_id AS PartitionID
      ,partition_number AS PartitionNo
FROM sys.partitions p
JOIN sys.indexes i
  ON p.index_id = i.index_id
 AND p.[object_id] = i.[object_id]
WHERE hobt_id = 72057594693222400;


This just returned the name of the target table and the primary key, so not much help to be found there. What I can tell you was that both queries were loading into the same partition in the same hobt_id on the same index.


i.e. there shouldn't be any lock conflict when inserting two distinct records in parallel. As these conflicts were occurring during a large batch load of millions of rows, it was not obvious which inserted rows had conflicting "index key hash values".


Having hit this apparent dead-end, James asked around for help:


I was struggling to come up with an answer. So I thought I’d run the scenario past my good friend Simon Sabin to see if he had any ideas. Sure enough he came up with an undocumented feature in SQL Server 2005 and above (courtesy of the late Ken Henderson) that helped us understand this particular problem. The feature is called %%lockres%%.


By exporting his SQL Profiler deadlock charts into XML format he was able to see the two distinct "lock resource hash values":


waitresource="KEY: 7:72057594693222400 (0801a4a3d7f1)"

waitresource="KEY: 7:72057594693222400 (0801bf3924dc)"


Feeding the "lock resource hash values" into the following query, using the "undocumented feature" %%lockres%%, helped James reduce the problem to just 4 rows:

By filtering our query for the day and also by the lock resource hashes that had given us this problem we could really cut into this dataset. As the Date field is the first key of the clustered index this meant a clustered index seek was used and so %%lockres%% returned the key hashes for our clustered index which was what we wanted.


select *, %%lockres%%
from   dbo.Table (nolock)
where  %%lockres%% IN ('(0801a4a3d7f1)', '(0801bf3924dc)')
and    Date >= '2009-05-18'

Sure enough we had the same hash value appearing in different countries for the same data. Even though the data itself was unique SQL Server’s hashing algorithm was not able to achieve a unique result. When this occurs it is called a collision ... once you have the values you can see the problem with only 4 rows of data. Check this out:


CREATE TABLE #tmp
( [Date]     DATETIME
 ,CountryID  TINYINT
 ,GroupID    SMALLINT
 ,CodeID     INT)

INSERT INTO #tmp
SELECT '20090519', 2, 4271, 5835066 UNION ALL
SELECT '20090519', 2, 4619, 2546652 UNION ALL
SELECT '20090519', 3, 4245, 2651987 UNION ALL
SELECT '20090519', 3, 4657, 5744053;

CREATE UNIQUE CLUSTERED INDEX tmp_PK ON #tmp
( [Date], CountryID, GroupID, CodeID )

SELECT %%lockres%% AS LockHash, *
FROM   #tmp
ORDER BY %%lockres%%, CountryID

DROP TABLE #tmp

to see the results (a screenshot in James' original post): the same LockHash value appears against rows with different CountryIDs.


CountryID 2 could deadlock with CountryID 3!


James then goes on to replicate an actual deadlock based on inserting these 4 rows, and also discusses mitigation strategies. If you have got through to this point, the rest of his blog piece is easy reading ;-)
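The underlying effect, distinct keys hashing to the same lock resource, is easy to reproduce in miniature. Here a deliberately tiny hash space stands in for SQL Server's lock-resource hash (which is internal and undocumented); with more distinct keys than buckets, the pigeonhole principle makes a collision certain:

```python
from collections import defaultdict

# The same four unique rows from James' repro.
rows = [
    ("20090519", 2, 4271, 5835066),
    ("20090519", 2, 4619, 2546652),
    ("20090519", 3, 4245, 2651987),
    ("20090519", 3, 4657, 5744053),
]

def lock_hash(row, buckets=2):
    # Toy stand-in for the lock-resource hash: a tiny bucket count
    # guarantees that unique rows collide, just at a smaller scale.
    return sum(hash(col) for col in row) % buckets

by_hash = defaultdict(list)
for row in rows:
    by_hash[lock_hash(row)].append(row)

collisions = {h: rs for h, rs in by_hash.items() if len(rs) > 1}
print(collisions)
```

SQL Server's real hash space is vastly larger, so collisions are rare, but the principle is identical: a hash is not a key, and two unique rows can contend for the same lock resource.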


tech: July 2009 SQL Server usergroup

The July 2009 SQL Server usergroup (at Microsoft London Office) was interesting. The format was informal, with plenty of Q&A and good audience participation.

  • 18:30 - 18:40 Intro, what’s in the news, Q&A,
  • 18:40 - 18:45 Nugget 1 %%lockres%% - James Rowland-Jones
  • 18:45 - 18:50 Nugget 2 - Common MDX Gotchas when aggregating inside Calculated Members - Ashwani Roy
  • 18:50 - 19:40 End-to-End SQL Server Performance Troubleshooting - Iain Kick
  • 20:00 - 21:00 Replaying Trace Files - Martin Bell



Iain Kick gave an excellent presentation on the various sources of information you should use when "SQL Server Performance Troubleshooting". There were striking parallels for me with the recent Oracle/SAP performance work I have been doing. I am going to go through a few of the key sources which Iain discussed and their equivalents in my recent SAP/Basis investigations.

1) Monitor and Examine your logs

"The SQL Server Errors Logs should be monitored"

Actually, logs (error, system, job ...) are an area I think he could have spoken more about. In the Oracle/SAP sphere I regularly check:

a) The Oracle Alert Logs - as an Oracle DBA I am comfortable going through the alert log, this can highlight important database errors (space management problems, database corruptions, file system problems, ...)

b) Within SAP R3 systems, I regularly check the "System Logs" (SM21) and for "Short Dumps" (ST22). These are tricky to read; however, if you are unsure about a message, you should investigate. Often I check the historical logs: have we seen these messages before? Is this a warning or a more routine alert? Ideally your system should be tuned so that routine errors are not reported; for example, when running the database checks from SAP, you adjust the check parameters so that only genuine errors are reported. However, this is not always possible.

c) Checking Job Logs (e.g. SM37 in SAP) - job logs and abnormal execution times are another useful indicator of performance problems. Batch jobs can also provide a useful benchmark of overall system performance and give an early indication of general performance problems (e.g. stable/routine jobs start taking significantly longer, or certain jobs, perhaps at critical times, start taking much longer ...)

At the end of Iain's talk, he went on to tell us how the automated monitoring and analysis tools which Quest provide will analyse much of this and present the results in a more administrator-friendly format. Quest have a strong reputation; their Oracle product TOAD is very well regarded. Within a SAP system, Central Monitoring (RZ20) allows you to configure a large tree-like structure of alerts with a simple traffic-light approach, i.e. Red, Yellow or Green. On top of this, SAP's Solution Manager product allows you to monitor all your SAP systems from a single central location. This sort of functionality is also provided within Oracle by Enterprise Manager.

2) Windows Perfmon/Sysmon (passive background monitoring)

I typically associate Perfmon with O/S level statistics (e.g. paging rates/sec, CPU % usage, and disc queue lengths). However, Windows Perfmon goes beyond this, allowing you to collect metrics specifically for SQL Server.

Brent Ozar (also a SQL Server DBA working for Quest) gives a good overview of these in his podcast "Perfmon Tutorial Video" (http://www.brentozar.com/archive/2006/12/dba-101-using-perfmon-for-sql-performance-tuning/).

Brent recommends the "Best SQL Server Performance Monitor Counters to Analyze":

  • Memory – Available MBytes
  • Paging File – % Usage
  • Physical Disk – % Disk Time
  • Physical Disk – Avg. Disk Queue Length
  • Physical Disk – Avg. Disk sec/Read
  • Physical Disk – Avg. Disk sec/Write
  • Physical Disk – Disk Reads/sec
  • Physical Disk – Disk Writes/sec
  • Processor – % Processor Time
  • SQLServer:Buffer Manager – Buffer cache hit ratio
  • SQLServer:Buffer Manager – Page life expectancy
  • SQLServer:General Statistics – User Connections
  • SQLServer:Memory Manager – Memory Grants Pending
  • System – Processor Queue Length
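These counters only become useful with thresholds attached. The cut-offs below are common rules of thumb (my own illustrative picks, not official guidance), wired into a tiny checker:

```python
# Rule-of-thumb thresholds (illustrative only, not official Microsoft guidance).
THRESHOLDS = {
    "Physical Disk - Avg. Disk sec/Read":               lambda v: v <= 0.020,
    "Physical Disk - Avg. Disk sec/Write":              lambda v: v <= 0.020,
    "SQLServer:Buffer Manager - Page life expectancy":  lambda v: v >= 300,
    "SQLServer:Memory Manager - Memory Grants Pending": lambda v: v == 0,
}

def check(sample):
    """Return the names of any counters outside their healthy range."""
    return [name for name, ok in THRESHOLDS.items()
            if name in sample and not ok(sample[name])]

sample = {
    "Physical Disk - Avg. Disk sec/Read": 0.045,              # slow reads
    "SQLServer:Buffer Manager - Page life expectancy": 1200,  # healthy
}
print(check(sample))   # flags only the slow disc reads
```

In practice you would feed this from a Perfmon log file rather than a hand-built dictionary, but the point stands: raw counter values mean little without a baseline or threshold to compare against.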

I have watched several of Brent's video podcasts and they are excellent.
Most of the systems I have been working on recently run on Solaris or HP (a few on Windows Servers). Typically, for this sort of background monitoring of O/S level statistics, I have been using:

  • vmstat - (Solaris/Linux) focusing on CPU queues and paging
  • prstat - (Solaris/Linux) overall CPU % usage, CPU usage per process and CPU usage aggregated by user (prstat -a)
  • top - (HP) process monitoring

I always have my copy of "Unix for Oracle DBAs" (by Donald K. Burleson, http://oreilly.com/catalog/9780596000660/) to hand at work. This is a great pocket guide for database and application server administrators, showing how to perform routine administration tasks under the various flavours of Unix (Solaris, HP, AIX and Linux).

Returning to Iain's presentation, he also mentioned a couple of other practical pointers when working with Perfmon: (a) the default interval is 15 secs; for analyzing SQL performance you may want to reduce the interval to 3 seconds, and (b) for readability you should not chart more than 10 counters at one time (although if you are recording to a file instead of charting, this is presumably not relevant).

NB The terms perfmon and sysmon are used synonymously; formally the product is called sysmon, but the term perfmon appears to be more popular.

3) SQL Profiler (active foreground monitoring)

This is a very handy tool for (a) pinpointing problems within the database and (b) sampling live production loads, which can then be played back in a PVT (Performance Volume Testing) system.

Iain also discussed filtering. For example, if a particular user is experiencing performance problems, you can extract just the SQL they are running, with details of executions and associated costs, enabling you to identify the source of the problem much more precisely.

Next Iain gave a nice demonstration of how to review/step through a captured SQL profile and monitor Sysmon metrics at the same time! As you step through the "critical period" (a few seconds or a few minutes) where you have captured your bottleneck/deadlock/performance problem, the lower half of the screen shows a chart displaying the progress of your pre-selected Sysmon metrics. I have seen this before; it is a great feature, enabling you to pinpoint and accurately assess your "resource problem": CPU utilisation, CPU queues, paging rates, disc queue lengths (data drives, log drives, system/software drives), locking contention ...

There are tracing tools built into Oracle, where you can trace individual sessions or instances. These are powerful monitoring tools, but in my experience they are not so easy to use and do not directly link into non-database monitoring tools like Sysmon.

4) SQL Execution Plans (SSMS - SQL Server Management Studio)

Some points on how to read SQL Execution Plans:

a) Start by reading the plan from the bottom right-hand corner, i.e. the lowest levels of access such as "index reads" or "table scans"; then work backwards up the page, as the various data sources are joined and aggregated until the final result set is returned.

b) When using the graphical tools to represent the various elements making up an execution plan, the "fatter linking arrows" show immediately where the most expensive elements are.

c) Alternatively you can use a text-based programmatic approach

SET STATISTICS TIME { ON | OFF }
SET STATISTICS IO { ON | OFF }


d) "In general avoid full table scans, index reads are better". This is a generalisation which often annoys me, as developers and middle management hold it as an article of faith, but it can be wrong/misleading. I believe the rule of thumb for optimizers is that if the optimizer calculates that more than roughly 20% of the rows are to be returned, then a full table scan is more efficient than index reads (because for each row in an "index read" you need to read the index page/block and then the data page/block). So seeing a full table scan is not necessarily an indicator of poor performance!

On the other hand, I recently worked on a performance problem where replacing a full table scan with an index read improved performance by a factor of 4000.

Understanding these points helps us understand why good statistics are critical: for a large table, if the stats indicate that only a few rows will be returned, then index reads will be used; however, if the stats are wrong, the optimizer may fail to choose a full table scan which would be much quicker (I have seen this sort of performance problem before as well).
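The 20% rule of thumb falls out of a very simple cost model: an index read touches roughly two blocks per row (index block plus data block), while a scan touches every block exactly once. A sketch of that model (the block counts are made up for illustration):

```python
def cheapest_access(total_blocks, rows_returned, blocks_per_index_row=2):
    """Toy cost model: an index read costs ~2 block visits per row
    (index block + data block); a full scan reads every block once."""
    index_cost = rows_returned * blocks_per_index_row
    scan_cost = total_blocks
    return "index" if index_cost < scan_cost else "full scan"

# A 10,000-block table holding ~1M rows (~100 rows per block):
print(cheapest_access(10_000, 500))       # tiny result set: index wins
print(cheapest_access(10_000, 400_000))   # 40% of the rows: scan wins
```

Real optimizers weigh in multiblock reads, caching and clustering factor, so the crossover point varies, but the shape of the trade-off is exactly this.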

e) Estimated vs Actual query plans

I wasn't the only one surprised when he briefly mentioned the icons for the estimated query plan and the actual query plan.

I had thought that unless there is a change in the structure of the database (e.g. adding a new index), fresh database statistics are gathered, or an execution plan is aged out, then the execution plan doesn't change? I'm not aware of any dynamic feedback whereby the DBMS engine automatically detects what it perceives to be "slow/inefficient SQL plans", uses the information gathered during execution, and potentially comes up with a new plan!? Yes, modern database engines capture workload which can be analysed using automatic tools to highlight performance bottlenecks, but this is not fed directly back into the execution plan.

Unfortunately, when the panel were asked (by someone else in the audience) to explain the difference between estimated and actual query plans, there was quite a lot of speculation around the idea that "the statistics and the actual data might be widely different", but no one gave an answer that satisfied me. I did add "my two pence worth": that I wasn't aware that the actual execution could be used to feed back directly into the execution plan because the statistics were misleading. However, I let the point drop and decided to "google it" later. I think the following explanation by Microsoft's SQL Server Customer Advisory Team is helpful:

While the “Estimated” and “Actual” query plans are same in most cases, there are some exception. If a table (temporary or permanent) is created in the same batch with the query, SQL Server has to recompile the query since the query definition is unknown when the batch is compiled the first time.


The authors also give a nice example of where the two query plans do diverge (see link above)

In terms of Oracle Execution plans, there are direct equivalents - showing the expected executions for each element in the plan.

5) Dynamic Management Views

Oracle has always had its V$ views, which allow DBAs to script performance and general database monitoring commands. These views show a huge gamut of database operational metrics (since instance start-up). Being able to script the extraction of such information is sometimes much more practical than a GUI interface: easier to automate, easier to document, more repeatable ...

Since SQL2005 Microsoft has introduced "dynamic management views":

The dynamic management views (DMVs) in SQL Server 2005 are designed to give you a window into what's going on inside SQL Server. They can provide information on what's currently happening inside the server as well as the objects it's storing. They are designed to be used instead of system tables and the various functions provided in SQL Server 2000. This article provides an introduction to DMVs and covers a few of the basic views and functions.


Iain briefly went through a few of the most useful views:

sys.dm_io_virtual_file_stats
Similar to fn_virtualfilestats in SQL Server 2000, this DMV function returns I/O statistics for data and log files for one or all databases and one or all files within the database(s).

sys.dm_os_sys_info
Returns a miscellaneous set of useful information about the computer, such as the hyper-thread ratio, the max worker count, and other resources used by and available to SQL Server.

sys.dm_os_sys_memory*
Returns a complete picture of memory at the operating system level, including information about total and available physical memory, total and available page memory, system cache, kernel space and so forth.

sys.dm_exec_requests
Returns one row for each request executing within SQL Server, but does not contain information for code that executes outside of SQL Server, such as distributed queries or extended stored procedures

sys.dm_tran_locks
Returns information about currently active requests to the lock manager, broken into a resource group and a request group. The request status may be active, convert, or may be waiting (wait) and includes details such as the resource on which the lock is requested (for the resource group) or the lock request itself (for the request group).



Iain was rather pleased to highlight one area where SQL Server metrics have "stolen a lead" over Oracle (there appears to be a degree of rivalry between the Oracle and SQL Server teams at Quest, with the Oracle team being a little superior at times?). With the DMV SYS.DM_DB_INDEX_USAGE_STATS and a "not in" clause, you can easily identify any indexes which haven't been used since start-up:


USE AdventureWorks
GO
DECLARE @dbid INT
SELECT
@dbid = DB_ID(DB_NAME())
SELECT OBJECTNAME = OBJECT_NAME(I.OBJECT_ID),
INDEXNAME = I.NAME,
I.INDEX_ID
FROM SYS.INDEXES I
JOIN SYS.OBJECTS O
ON I.OBJECT_ID = O.OBJECT_ID
WHERE OBJECTPROPERTY(O.OBJECT_ID,'IsUserTable') = 1
AND I.INDEX_ID NOT IN (
SELECT S.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS S
WHERE S.OBJECT_ID = I.OBJECT_ID
AND I.INDEX_ID = S.INDEX_ID
AND DATABASE_ID = @dbid)
ORDER BY OBJECTNAME,
I.INDEX_ID,
INDEXNAME ASC
GO




This is probably a useful set of metrics on index usage and I wouldn't be surprised if other database vendors (i.e. Oracle) provide these details soon ....



Monday, June 22, 2009

tech: SQLServer - GUIDS and clustered index



One of the differences between Oracle and SQL Server, is the dominance of clustered tables within SQL Server applications.

Briefly, the concept of a clustered table is that the data is physically stored in the order of the primary key. For example, suppose your order table has a primary key on order_num in the format yyyymmddnnn, i.e. if you take 37 orders on the 22nd June 2009 then the order numbers would be 20090622001, 20090622002, 20090622003, ... 20090622037. In this example, as new orders come through they are added at the end of the table/index and all works well.

However, one commonly reported problem within the Microsoft software development community arises when a developer uses a GUID (a 16-byte unique identifier generated by Microsoft programs) and, quite naturally, makes the GUID field the unique/primary key. As each new record / GUID is created, the values do not arrive in sequential order, and so this makes a big mess of your table storage (page splits and fragmentation). The solution in SQL 2005 was the NEWSEQUENTIALID() function:

CREATE TABLE myTable (ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID())
Creates a GUID that is greater than any GUID previously generated by this function on a specified computer.
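A fuller sketch of the fix (the table and column names here are invented for illustration), combining NEWSEQUENTIALID() with a clustered primary key so that new rows always land at the end of the index:

```sql
CREATE TABLE dbo.Orders
(
    OrderID   uniqueidentifier NOT NULL
              CONSTRAINT DF_Orders_OrderID DEFAULT NEWSEQUENTIALID(),
    OrderDate datetime NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
);

-- Each new row picks up an ever-increasing GUID, so inserts go at the
-- end of the clustered index instead of causing page splits.
INSERT INTO dbo.Orders (OrderDate) VALUES (GETDATE());
```

Note that NEWSEQUENTIALID() can only be used in a DEFAULT constraint, not called directly in a SELECT.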

This is still an area of confusion amongst some developers; the following well-written question was posted on Stack Overflow in Nov 08:


Should I get rid of clustered indexes on Guid columns
I am working on a database that usually uses GUIDs as primary keys.
By default SQL Server places a clustered index on primary key columns. I understand that this is a silly idea for GUID columns, and that non-clustered indexes are better.
What do you think - should I get rid of all the clustered indexes and replace them with non-clustered indexes?
Why wouldn't SQL's performance tuner offer this as a recommendation?
http://stackoverflow.com/questions/277625/should-i-get-rid-of-clustered-indexes-on-guid-columns


The responses to the question are that (a) SQL Server works best with clustered indexes and (b) the old fragmentation problems are preventable via NEWSEQUENTIALID():

With SQL 2005 and newsequentialid(), the fragmentation problem goes away to a large extent. It's best to measure by looking at sys.dm_db_index_physical_stats and sys_indexes. – RoadWarrior

Great, that answers how to use GUIDs and clustered indexes.

However I have one further question: why are clustered tables the "default" in SQL Server, while their equivalent structure within Oracle, the Index-Organized Table (IOT), is rare:

Index-organized tables provide faster access to table rows by the primary key. Also, since rows are stored in primary key order, range access by the primary key involves minimum block accesses. In order to allow even faster access to frequently accessed columns, the row overflow storage option can be used to push out infrequently accessed non-key columns from the B-tree leaf block to an optional overflow storage area. This limits the size and content of the row portion actually stored in the B-tree leaf block, resulting in a smaller B-tree and faster access.
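For comparison, an Oracle IOT is created with the ORGANIZATION INDEX clause; this sketch (table and column names invented) also shows the row-overflow option the quote mentions:

```sql
CREATE TABLE orders_iot
(
    order_num   NUMBER(11)    NOT NULL,
    customer_id NUMBER        NOT NULL,
    notes       VARCHAR2(400),
    CONSTRAINT pk_orders_iot PRIMARY KEY (order_num)
)
ORGANIZATION INDEX   -- rows are stored in the primary key B-tree itself
PCTTHRESHOLD 20      -- row portions above 20% of a block spill over
OVERFLOW;            -- ... into a separate overflow storage area
```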

Intuitively, the constraint of having to add rows in the order of the primary key makes this sort of table tricky to work with; this may explain why these tables are not so popular in the Oracle world.

One clue may also be that "SQL Server has no command for reorganizing a heap table":

Another benefit of clustered indexes is that they assist in reorganizing a table. Unlike other database engines, SQL Server has no command for reorganizing a heap table. It only works on indexes. So, if there is a badly fragmented table, the DBA is usually stuck with executing a table purge, copying all the data out, truncating the table, and copying all the data back in. This feat is not easy to accomplish in an environment with high amounts of enforced referential integrity.

If the table has a clustered index, the DBA can simply rebuild the clustered index, which reorganizes the data pages because the leaf level of the clustered index is the data.
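In SQL 2005 terms, the rebuild the quote describes is a one-liner (the index and table names are hypothetical):

```sql
-- Rebuild the clustered index: because its leaf level is the data,
-- this reorganizes the data pages themselves
ALTER INDEX PK_Orders ON dbo.Orders REBUILD;

-- Or rebuild every index on the table in one statement
ALTER INDEX ALL ON dbo.Orders REBUILD;
```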



Sunday, June 21, 2009

tech: My IT History 1989-92 Macs, Pagemaker and Desktop Publishing


At Oxford (1989-92), I was reading Maths; with hindsight I would have read Maths and Computing. While I was good at pure maths (calculus, linear algebra, logic), probability theory and statistics, I never got on with abstract algebra or mechanics. If I had studied IT at university, I would have enjoyed it more and done better (well, a lot better than struggling with fluid dynamics and group theory).

While I was at Oxford, I started to use Macs, PageMaker and desktop publishing. I joined the Cherwell newspaper and for one term worked on the "Design and Production" team, which typically involved converting late articles, graphics and other images into a finished newspaper before the print deadline. I had a close friend at the time (Zia Jafri) who was really talented at this sort of design work (he had some year-out experience producing reports for a marketing company in the City).

It was fun to work at Cherwell, home to some of Oxford's most self-confident students. The officers were nice, and the newspaper has a long and prestigious history. I also worked on the production of the Wadham Sound - a weekly college A3 newsletter.

Beyond this I didn't learn much about IT while at Oxford; in IT terms these were rather barren years.

tech: My IT History - 1982-88 The First Generation of Home PCs


The first IT book I ever read was a prize, "30 Hour BASIC" (I had won the school maths prize in 1982, the year I discovered my own solution for the Rubik's cube - "my greatest ever achievement", aged 11). At first I didn't really understand much beyond the first chapter, but I was captivated. Later, when I went on to develop a simple stock control system for my O level computing project, I went back to this book and understood the importance of good modular design and of using procedures and functions to avoid "spaghetti code".

In 1986, my father brought home an IBM PC AT, a physically huge machine, mostly green screen (text-based), and with only the esoteric DOS prompt to work from! At first none of the family had a clue how to use this machine. It was great: firstly I learnt about DOS, as my previous computers (BBC Model B, Commodore 64, Atari ST) had more basic operating systems with built-in BASIC interpreters and uber-complex machine code (peeking and poking sprites was interesting but too tricky). Beyond the joys of DOS, my father wanted to know how to use word processors, write programs (the IBM BASIC which came with the machine required compiling) and use other applications, mostly text and command line based.

At school, I was studying Maths, Further Maths, Physics and Economics A levels. Good academic, logical subjects but no IT.

In 1988, I took a year-out to work as a research assistant at Morgan Grenfell (in the heart of London's financial capital - beautiful offices on Finsbury Square), then the Treasury (a magnificent building on the corner of Parliament Square and Whitehall). In both these jobs, I was working with office applications: spreadsheets, word processors, charting tools, Datastream, data query languages... These were my first office jobs and not surprisingly I focused on IT and software, which in those days was seen as the preserve of the geeks (everything was still DOS based and so not very user friendly).


tech: Introduction to my weekly technology blog


The idea behind this blog is to sit back at least once a week and focus on all the useful and interesting stuff I have learnt about computing and IT.

I have always been fascinated by computers, and there is always more to learn and understand in the wonderful world of IT.

For the last few years (since 1994) I have focused on databases (Oracle and SQL Server). Since 1997 I have been working in SAP Basis (infrastructure for SAP installations), which has a large element of DBA work and so this was a natural progression.

I also periodically help charities with their administration, building simple bespoke applications in MS Access.

I am going to integrate these technology articles within my main blog, but I label each article "tech:" to clearly mark the technology articles.