Thursday, 14 April 2011

SQL Data Pages

So what are Data Pages ?

The data stored within SQL Server resides in data files. For each database, there must be a minimum of one data file (We won't go into transaction logs in this blog). This data file, is like any other file and sits within the file system in a location, chosen by you. Within these data files are groups of extents. So what are extents you ask ! Extents are basically a grouping of data pages, and there are 8 data pages to each extent. Data pages are essentially where your data resides within the data file(s). Just as an FYI now that I think about it, why are data pages grouped into extents. One of the reasons is for lock escalation, which I'll be going into detail in a later blog. If for instance you are doing some operation in SQL which requires page locks, and SQL had to lock 6 out of 8 pages in an extent, the SQL engine will place a lock on the whole extent. It's easier to manage 1 lock on an extent, rather than 6 individual locks on the pages. 


Each data page is 8kb in size.

So, let's have a closer look. I have taken the sample AdventureWorks database (SQL 2008 R2) as an example, and we'll be looking at the Person.PhoneNumberType table. 

There is a command DBCC IND. This command takes three paramaters, 1) Database Name, 2) Table and 3) IndexID
DBCC IND(@Database, @Table, @IndexID).

In our case, we can execute
DBCC IND(AdventureWorks, [Person.PhoneNumberType], -1)
-1 will give us all values here

Here is a partial output  of the DBCC IND command.

PagePID is a unique identifier for the page, and IAMPID is a parent & child relationship between the PagePID's. PageType of 10, and IAMPID of NULL indicates that this page is a root page which has no parent. You'll notice the second IAMPID is 823, which relates to the first PagePID. PageTypes 1,2 and 10 relate to Data Pages, Index Pages, and IAM Pages.  
Now let's look at the actual data page itself. In order to view the data page we need to turn on a trace flag, as follows:
DBCC TRACEON(3604)
DBCC PAGE takes 4 parameters. These are the database name, the file ID (in our case this is 1 as we just have the 1 file), the PagePID, and the Output Format (We'll use 1 here to show everything). 
DBCC PAGE(AdventureWorks2008R2, 1, 822, 1)  
There are three main areas to a data page, the header, which is always 96 bytes, the data rows or SLOTS , and the Off Set

So, as a brief overview, the header basically provides information on the type of data page that we're looking at. The data rows/slots gives information about the actual data stored in the table, and the Off Set, is like an index. It's like a pointer to enable SQL to quickly allocate where the actual data rows are stored.  A snapshot of the output to the above command looks like this.

You'll notice at the bottom there is the Off Set area. Row 0 with bytes 0 - 96, as we said, is taken up with information for the header of the data page. From bytes 97 to 127 is taken up with Row 1, and after that Row 2. If we have a look further up, we can see under Slot 1 (which is Row 2 in the offset) , the actual record with a value of "Home", where if you do a SELECT * FROM Person.PhoneNumberType, you'll see the second row has a value of "Home"
Hopefully that gives you a basic understanding of data pages. For more low-level information, I recommend the following video for viewing.

No comments:

Post a Comment