5/15/2012

Paging in SQL Server 2005 (ROW_NUMBER() - Selecting the data returned within a given range)

Introduction

Developers and database administrators have long debated methods for paging recordset results from Microsoft SQL Server, trying to balance ease of use with performance. The simplest methods were less efficient because they retrieved entire datasets from SQL Server before eliminating records which were not to be included, while the best-performing methods handled all paging on the server with more complex scripting. The ROW_NUMBER() function introduced in SQL Server 2005 provides an efficient way to limit results relatively easily.

Paging Efficiency

To scale well, most applications work with only a portion of the available data at a given time. Web-based data maintenance applications are the most common example of this, and several data-bindable ASP.NET classes (such as GridView and DataGrid) have built-in support for paging results. While it is possible to handle paging within the web page code, this may require transferring all of the data from the database server to the web server every time the control is updated. To improve performance and efficiency, data which will not be used should be eliminated from processing as early as possible.

Paging Methods

Many popular databases offer functions allowing you to limit which rows are returned for a given query based upon their position within the record set. For example, MySQL provides the LIMIT qualifier, which takes two parameters. The first LIMIT parameter specifies which (zero-based) row number will be the first record returned, and the second parameter specifies the maximum number of records returned. The query:

SELECT * FROM table LIMIT 20,13

...will return 13 records starting at zero-based position 20, that is, the 21st through the 33rd records, assuming at least 33 records are available to return. If fewer than 33 records are available, the query will return all records from that position on. If 20 or fewer records are available, none will be returned.
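
MySQL also accepts a more explicit spelling of the same clause, which makes the offset and the row count harder to confuse. A minimal sketch, assuming a hypothetical table named my_table with a hypothetical key column id (the bare name table is a reserved word in MySQL and would need backtick quoting):

```sql
-- Equivalent to LIMIT 20,13: skip 20 rows, then return at most 13.
-- my_table and id are hypothetical names used only for illustration.
SELECT *
FROM my_table
ORDER BY id          -- without an ORDER BY, the row order is not guaranteed
LIMIT 13 OFFSET 20;
```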

SQL Server 2005 does not have this functionality; however, the release does have a number of other new tricks. For instance, support for CLR procedures means it is possible to reuse existing paging methods by writing VB.NET or C# code that executes within the SQL Server environment. Unfortunately, CLR procedures are generally not as efficient as native Transact-SQL for set-based data access. To ensure the best performance, queries should still be written in TSQL whenever practical.

Using ROW_NUMBER()

TSQL in the 2005 release includes the ROW_NUMBER() function, which adds a column (of type bigint) to each record containing the record's ordinal position in the result set. Stated more simply, it adds the record's position within the result set as an additional field so that the first record has a 1, the second a 2, etc. This may appear to be of little value; however, by using nested queries we can use it to our advantage.

To demonstrate ROW_NUMBER() and to explore how the paging solution works, create a simple salary table and populate it with sample data using the following commands:

CREATE TABLE [dbo].[Salaries](
    [person] [nvarchar](50) NOT NULL,
    [income] [money] NOT NULL,
 CONSTRAINT [PK_salaries] PRIMARY KEY CLUSTERED(
    [person] ASC
)) ON [PRIMARY]
GO

INSERT INTO Salaries VALUES ('Joe', '28000')
INSERT INTO Salaries VALUES ('Sue', '96000')
INSERT INTO Salaries VALUES ('Michael', '45000')
INSERT INTO Salaries VALUES ('John', '67000')
INSERT INTO Salaries VALUES ('Ralph', '18000')
INSERT INTO Salaries VALUES ('Karen', '73000')
INSERT INTO Salaries VALUES ('Waldo', '47000')
INSERT INTO Salaries VALUES ('Eva', '51000')
INSERT INTO Salaries VALUES ('Emerson', '84000')
INSERT INTO Salaries VALUES ('Stanley', '59000')
INSERT INTO Salaries VALUES ('Jorge', '48000')
INSERT INTO Salaries VALUES ('Constance', '51000')
INSERT INTO Salaries VALUES ('Amelia', '36000')
INSERT INTO Salaries VALUES ('Anna', '49000')
INSERT INTO Salaries VALUES ('Danielle', '68000')
INSERT INTO Salaries VALUES ('Stephanie', '47000')
INSERT INTO Salaries VALUES ('Elizabeth', '23000')

The ROW_NUMBER() function takes no parameters; it simply adds the row number to each record in the result set. To number the rows consistently, however, SQL Server needs to know how to sort the data. Because of this, ROW_NUMBER() must be followed immediately by an OVER clause. For ROW_NUMBER(), the OVER clause requires an ORDER BY clause (a PARTITION BY clause is optional). The basic syntax for querying the Salaries table is:

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income
FROM Salaries

 

This returns the following result:
 
(No column name)    person    income
1    Amelia    36000.00
2    Anna    49000.00
3    Constance    51000.00
4    Danielle    68000.00
5    Elizabeth    23000.00
6    Emerson    84000.00
7    Eva    51000.00
8    Joe    28000.00
9    John    67000.00
10    Jorge    48000.00
11    Karen    73000.00
12    Michael    45000.00
13    Ralph    18000.00
14    Stanley    59000.00
15    Stephanie    47000.00
16    Sue    96000.00
17    Waldo    47000.00

The Salaries data now appears sorted by person, and it has an extra column indicating each record's position within the results. (Without an ORDER BY on the query itself, SQL Server does not guarantee the display order, even though it usually matches the numbering here.)

If you want the results displayed in a different order than the one they were numbered in, you can include a different ORDER BY clause as part of the normal SELECT syntax:

 

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income 
FROM Salaries 
ORDER BY income

This returns the following result:
 
(No column name)    person    income
13    Ralph    18000.00
5    Elizabeth    23000.00
8    Joe    28000.00
1    Amelia    36000.00
12    Michael    45000.00
15    Stephanie    47000.00
17    Waldo    47000.00
10    Jorge    48000.00
2    Anna    49000.00
3    Constance    51000.00
7    Eva    51000.00
14    Stanley    59000.00
9    John    67000.00
4    Danielle    68000.00
11    Karen    73000.00
6    Emerson    84000.00
16    Sue    96000.00
If we want to limit the results displayed to a certain range, we need to nest this SELECT inside another one and provide a name for the ROW_NUMBER() column. To limit our results to records 5 through 9, we can use the following query:

 

SELECT *
FROM   (SELECT ROW_NUMBER() OVER(ORDER BY person) AS rownum, person, income
        FROM   Salaries) AS Salaries1
WHERE  rownum >= 5 AND rownum <= 9

This returns the following result:

rownum    person    income
5    Elizabeth    23000.00
6    Emerson    84000.00
7    Eva    51000.00
8    Joe    28000.00
9    John    67000.00
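
The same range query can also be written with a common table expression, another feature introduced in SQL Server 2005, with the page boundaries held in variables. This is only a sketch of an alternative spelling; the names @first, @last and NumberedSalaries are illustrative and not part of the original example:

```sql
-- Page boundaries held in variables. SQL Server 2005 needs DECLARE and SET
-- as separate statements (inline initialization arrived in a later release).
DECLARE @first int, @last int
SET @first = 5
SET @last  = 9

-- The leading semicolon terminates the previous statement, which WITH requires.
;WITH NumberedSalaries AS (
    SELECT ROW_NUMBER() OVER(ORDER BY person) AS rownum, person, income
    FROM   Salaries
)
SELECT   rownum, person, income
FROM     NumberedSalaries
WHERE    rownum BETWEEN @first AND @last
ORDER BY rownum
```

Because the boundaries are ordinary variables, this form needs no dynamic SQL as long as the sort column is fixed.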
Again, we can change the sort order by adding an ORDER BY clause. This is most easily accomplished by using the outer SELECT statement:

SELECT *
FROM   (SELECT ROW_NUMBER() OVER(ORDER BY person) AS rownum, person, income
        FROM   Salaries) AS Salaries1
WHERE  rownum >= 5 AND rownum <= 9
ORDER BY income

This returns the following result:

rownum    person    income
5    Elizabeth    23000.00
8    Joe    28000.00
7    Eva    51000.00
9    John    67000.00
6    Emerson    84000.00

If we want to support the same type of arguments that MySQL's LIMIT clause supports, we can create a stored procedure that accepts a beginning point and a maximum number of records to return. ROW_NUMBER() requires that the data be sorted, so we will also have a required parameter for the ORDER BY clause. Execute the following statement to create a new stored procedure:

CREATE PROCEDURE [dbo].[pageSalaries]
  @start  int = 1
 ,@maxct  int = 5
 ,@sort   nvarchar(200)
AS
  SET NOCOUNT ON
  DECLARE
    @STMT nvarchar(max),    -- SQL statement to execute
    @ubound int

  IF @start < 1 SET @start = 1
  IF @maxct < 1 SET @maxct = 1
  SET @ubound = @start + @maxct
  SET @STMT = ' SELECT person, income
                FROM (
                      SELECT  ROW_NUMBER() OVER(ORDER BY ' + @sort + ') AS row, *
                      FROM    Salaries
                     ) AS tbl
                WHERE  row >= ' + CONVERT(varchar(9), @start) + ' AND
                       row <  ' + CONVERT(varchar(9), @ubound)
  EXEC (@STMT)              -- return requested records 

The pageSalaries procedure begins with SET NOCOUNT ON, which suppresses the "rows affected" message sent after each statement (a common step for reducing network chatter in stored procedures). We then declare two necessary variables, @STMT and @ubound. Because we want to be able to change which ORDER BY argument is used, we need to generate our query statement dynamically by storing it in @STMT. The next lines ensure that only positive numbers are used for the starting position and maximum size, then calculate the range of ROW_NUMBER() values being requested. (If we wanted to be zero-based like MySQL's LIMIT, we could do so with a few minor tweaks.) Once the dynamic SQL command has been strung together, it is executed so that the results are returned.
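
Because @sort is concatenated directly into the statement, whatever the caller passes becomes part of the SQL that runs. One possible hardening, sketched below under the assumption that sorting by a single column is enough, checks the name against sys.columns, wraps it with QUOTENAME, and passes the range limits as real parameters through sp_executesql. The procedure name pageSalariesSafe is hypothetical and not part of the original article:

```sql
CREATE PROCEDURE [dbo].[pageSalariesSafe]   -- hypothetical name
  @start  int = 1
 ,@maxct  int = 5
 ,@sort   sysname            -- a single column name, not a full ORDER BY list
AS
  SET NOCOUNT ON
  DECLARE @STMT nvarchar(max), @ubound int

  -- Reject anything that is not an actual column of dbo.Salaries.
  IF NOT EXISTS (SELECT 1
                 FROM   sys.columns
                 WHERE  object_id = OBJECT_ID(N'dbo.Salaries')
                   AND  name = @sort)
  BEGIN
    RAISERROR('Unknown sort column: %s', 16, 1, @sort)
    RETURN
  END

  IF @start < 1 SET @start = 1
  IF @maxct < 1 SET @maxct = 1
  SET @ubound = @start + @maxct

  -- QUOTENAME brackets the identifier; the range limits travel as parameters
  -- instead of being converted to text and concatenated.
  SET @STMT = N'SELECT person, income
                FROM (
                      SELECT ROW_NUMBER() OVER(ORDER BY ' + QUOTENAME(@sort) + N') AS row, *
                      FROM   dbo.Salaries
                     ) AS tbl
                WHERE row >= @start AND row < @ubound'
  EXEC sp_executesql @STMT, N'@start int, @ubound int',
                     @start = @start, @ubound = @ubound
GO
```

The trade-off is flexibility: this version cannot accept a multi-column list or a DESC qualifier without further validation logic.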

Execute the following statement to test the stored procedure:

EXEC pageSalaries 4, 7, 'income'

This returns the following result:

person    income
Amelia    36000.00
Michael    45000.00
Stephanie    47000.00
Waldo    47000.00
Jorge    48000.00
Anna    49000.00
Constance    51000.00
If we execute:

EXEC pageSalaries 13, 7, 'income'

we receive back:

person    income
John    67000.00
Danielle    68000.00
Karen    73000.00
Emerson    84000.00
Sue    96000.00
... because the query goes beyond the number of records available.
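
One caveat when sorting by income: the sample data contains ties (Stephanie and Waldo both earn 47000, Constance and Eva both earn 51000), and SQL Server may number tied rows in either order. Since the first example page ends at row 10, exactly where Constance and Eva tie, the record that closes the page could differ between executions. A simple way to make the numbering repeatable is to add a unique column as a tie-breaker, which works because @sort is inserted into the ORDER BY clause as-is:

```sql
-- 'income, person' becomes the ORDER BY list inside ROW_NUMBER() OVER(...),
-- so person (the primary key) breaks ties between equal incomes.
EXEC pageSalaries 4, 7, 'income, person'
```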

Taking this one step further, we can write a stored procedure that does a more general form of paging. In fact, it can be generalized to the point that it can return any collection of fields, in any order, with any filtering clause. Because every argument is concatenated into dynamic SQL, it should only be called with trusted values. To create this procedure, execute the following command:

CREATE PROCEDURE [dbo].[utilPAGE]
  @datasrc nvarchar(200)
 ,@orderBy nvarchar(200)
 ,@fieldlist nvarchar(200) = '*'
 ,@filter nvarchar(200) = ''
 ,@pageNum int = 1
 ,@pageSize int = NULL
AS
  SET NOCOUNT ON
  DECLARE
     @STMT nvarchar(max)         -- SQL to execute
    ,@recct int                  -- total # of records (for GridView paging interface)

  IF LTRIM(RTRIM(@filter)) = '' SET @filter = '1 = 1'
  IF @pageSize IS NULL BEGIN
    -- leading spaces keep the concatenated clauses from running together
    SET @STMT =  'SELECT   ' + @fieldlist + 
                 ' FROM     ' + @datasrc +
                 ' WHERE    ' + @filter + 
                 ' ORDER BY ' + @orderBy
    EXEC (@STMT)                 -- return requested records 
  END ELSE BEGIN
    SET @STMT =  'SELECT   @recct = COUNT(*)
                  FROM     ' + @datasrc + '
                  WHERE    ' + @filter
    EXEC sp_executeSQL @STMT, @params = N'@recct INT OUTPUT', @recct = @recct OUTPUT
    SELECT @recct AS recct       -- return the total # of records

    DECLARE
      @lbound int,
      @ubound int

    SET @pageNum = ABS(@pageNum)
    SET @pageSize = ABS(@pageSize)
    IF @pageNum < 1 SET @pageNum = 1
    IF @pageSize < 1 SET @pageSize = 1
    SET @lbound = ((@pageNum - 1) * @pageSize)
    SET @ubound = @lbound + @pageSize + 1
    IF @lbound >= @recct BEGIN
      SET @ubound = @recct + 1
      -- return the last page of records if no records would be on the
      -- specified page
      SET @lbound = @ubound - (@pageSize + 1)
    END
    SET @STMT =  'SELECT  ' + @fieldlist + '
                  FROM    (
                            SELECT  ROW_NUMBER() OVER(ORDER BY ' + @orderBy + ') AS row, *
                            FROM    ' + @datasrc + '
                            WHERE   ' + @filter + '
                          ) AS tbl
                  WHERE
                          row > ' + CONVERT(varchar(9), @lbound) + ' AND
                          row < ' + CONVERT(varchar(9), @ubound)
    EXEC (@STMT)                 -- return requested records 
  END

You may receive the following informational message from SQL Server, which you can safely ignore:

Cannot add rows to sys.sql_dependencies for the stored procedure because it depends on the missing table 'sp_executeSQL'. The stored procedure will still be created; however, it cannot be successfully executed until the table exists.

The utilPAGE procedure accepts six parameters:

@datasrc    - the name of the table or view to query (anything valid in a FROM clause)
@orderBy    - the ORDER BY clause
@fieldlist  - the fields to return (including calculated expressions)
@filter     - the WHERE clause
@pageNum    - the page to return (must be greater than or equal to one)
@pageSize   - the number of records per page

The stored procedure needs the name of a data source to query against (such as a table) and one or more fields to sort by (since ROW_NUMBER() requires an ORDER BY clause in its OVER clause). If @filter is blank (the default), it will be set to "1 = 1" as a simple way to select all records. If @pageSize is not supplied, the query will run without paging and will not return a record count.

If, however, @pageSize is supplied, a version of the query is executed to get the total number of records. In order to have this record count available within the procedure and as a returned value, we use sp_executeSQL to support executing the statement while returning an output parameter. The record count is used to prevent returning empty results when possible, and to support paging interfaces that calculate the number of pages available (such as GridView). If we were calling this stored procedure to populate a GridView, we would return @recct as a ReturnValue parameter instead of using a result set, but we will use a result set for demonstration purposes.
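
The output-parameter technique can be tried on its own, outside the procedure. The following sketch runs the same kind of counting statement against the Salaries table; the filter value is just the one used in the test call later in this article:

```sql
DECLARE @recct int, @STMT nvarchar(max)

-- Inside the dynamic statement, @recct is a parameter of that statement.
SET @STMT = N'SELECT @recct = COUNT(*) FROM Salaries WHERE income > 1000'

-- The second argument declares the dynamic statement's parameters; OUTPUT on
-- both sides copies the value back into the outer @recct variable.
EXEC sp_executesql @STMT, N'@recct int OUTPUT', @recct = @recct OUTPUT

SELECT @recct AS recct   -- 17 with the sample rows inserted earlier
```

A plain EXEC (@STMT) could not do this, because variables declared in the calling batch are not visible inside the dynamically executed statement.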

The procedure calculates what the actual record positions will be for the requested page. Rather than allow the query to fail, there are safety checks ensuring that @pageSize and @pageNum are greater than zero, and that the result set will not be empty. If the specified page is out of range, this procedure will return the last possible page of records. This is helpful if a user changes more than one setting before refreshing their data, or if a significant amount of data is deleted between requests.
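
The boundary arithmetic can be followed in isolation. This sketch repeats the procedure's calculation for page 13 with 3 records per page against 17 records; the literal values are only an example:

```sql
DECLARE @pageNum int, @pageSize int, @recct int, @lbound int, @ubound int
SET @pageNum  = 13
SET @pageSize = 3
SET @recct    = 17

SET @lbound = (@pageNum - 1) * @pageSize      -- 36: rows after this one are wanted
SET @ubound = @lbound + @pageSize + 1         -- 40: rows before this one are wanted

IF @lbound >= @recct BEGIN                    -- 36 >= 17, so the page is out of range
  SET @ubound = @recct + 1                    -- 18
  SET @lbound = @ubound - (@pageSize + 1)     -- 14
END

-- The final query keeps rows where row > @lbound AND row < @ubound,
-- which here means rows 15 through 17.
SELECT @lbound AS lbound, @ubound AS ubound   -- 14 and 18
```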

The remainder of the procedure is virtually identical to the pageSalaries procedure. To test the utilPAGE stored procedure, execute the following statement:

EXEC utilPAGE 'Salaries', 'person', '*', 'income > 1000', 2, 4

This returns the following two result sets:

recct
17

row    person    income
5    Elizabeth    23000
6    Emerson    84000
7    Eva    51000
8    Joe    28000
If we execute:

EXEC utilPAGE 'Salaries', 'person', 'person, income', '', 13, 3

...we receive back:

recct
17

person    income
Stephanie    47000
Sue    96000
Waldo    47000

Even though the request should be for records 37 through 39 - far outside of what is available - the procedure returns the last available page of records. In contrast, requesting the third page with seven records per page using:

EXEC utilPAGE 'Salaries', 'person', 'person, income', '', 3, 7

...returns the last three records, as the page is not completely out of bounds:

person    income
Stephanie    47000
Sue    96000
Waldo    47000

All of these examples are based on simple single-table queries, which may not reflect what you need in the real world. While the utilPAGE procedure does not support ad-hoc JOINs, it does work with views. If you want paging support for multi-table queries, you should create a view (with all of the necessary JOINs) to use as the data source. Using a view follows good design practices as it ensures that your JOINs are performed consistently, allows easier ad-hoc querying from the command line, and is much easier to troubleshoot than a stored procedure's dynamic SELECT statement logic.
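
As a sketch of the view approach, suppose there were a second table recording each person's department. The table PersonDepartments and the view vSalariesByDepartment below are hypothetical and exist only to illustrate the idea:

```sql
-- Hypothetical lookup table, not part of the original example.
CREATE TABLE dbo.PersonDepartments (
    person     nvarchar(50) NOT NULL PRIMARY KEY,
    department nvarchar(50) NOT NULL
)
GO

-- The view hides the JOIN so that utilPAGE sees a single data source.
CREATE VIEW dbo.vSalariesByDepartment
AS
    SELECT s.person, s.income, d.department
    FROM   dbo.Salaries AS s
    INNER JOIN dbo.PersonDepartments AS d ON d.person = s.person
GO

-- First page of five records, sorted by department and then by person.
EXEC utilPAGE 'vSalariesByDepartment', 'department, person',
              'person, income, department', '', 1, 5
```

Only people with a row in PersonDepartments would appear, since the view uses an inner join.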

Conclusion


While SQL Server does not have as simple a method for paging results as some other databases, features introduced in the 2005 release have made it possible to page results efficiently more easily than ever before. In the next article in this series, we will go a step further and integrate this paging logic with a GridView through a Data Access Layer. 

------------------------------------------------------------

Demo:

set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go

-- =============================================
-- Author:  TaiTD
-- Create date: 02/02/2010
-- Description: Export to XML file
-- =============================================
ALTER proc [dbo].[GetDELETEDJobPostingsBy_JobTypeID]
(
      @JobTypeID int
)
As
Begin
      SELECT *
      FROM   (SELECT ROW_NUMBER() OVER(ORDER BY del.JobPostingID) AS rownum,
                     del.JobPostingID, del.Title, del.Description, del.IsSearchable,
                     JT.JobTypeID, JST.JobSubTypeID, JT.JobTypeKey, JST.JobSubtypeKey,
                     com.CompanyName, com.CompanyProfile
              FROM DELETED_JobPostings as del
              INNER JOIN Web_Companies as com ON del.CompanyID = com.CompanyID
              JOIN DELETED_JobPostings_Types as delJPT ON delJPT.JobPostingID = del.JobPostingID
              JOIN Web_JobSubtypes as JST ON JST.JobSubtypeID = delJPT.JobSubtypeID
              JOIN Web_JobTypes as JT ON JT.JobTypeID = JST.JobTypeID ) AS OOOOO
      -- note: rownum is assigned before these filters are applied
      WHERE IsSearchable=1 and LEN(Description)>100 and rownum >= 1 AND rownum <= 170000 and JobTypeID=@JobTypeID
End

 

--------------
--          SELECT top 9000 del.JobPostingID, del.Title, del.Description, del.IsSearchable, JT.JobTypeID, JST.JobSubTypeID,
--          JT.JobType, JST.JobSubType, com.CompanyName, com.CompanyProfile
--          FROM DELETED_JobPostings as del
--          INNER JOIN Web_Companies as com ON del.CompanyID = com.CompanyID
--          JOIN DELETED_JobPostings_Types as delJPT ON delJPT.JobPostingID = del.JobPostingID
--          JOIN Web_JobSubtypes as JST ON JST.JobSubtypeID = delJPT.JobSubtypeID
--          JOIN Web_JobTypes as JT ON JT.JobTypeID = JST.JobTypeID
--          WHERE del.IsSearchable=1 and LEN(del.Description)>100 and JT.JobTypeID=@JobTypeID
----------------
--SELECT *
--FROM   (SELECT ROW_NUMBER() OVER(ORDER BY JobPostingID) AS
--       rownum, JobPostingID, Title FROM DELETED_JobPostings) AS del
--WHERE  rownum >= 5 AND rownum <= 9
----------------
--SELECT *
--FROM   (SELECT ROW_NUMBER() OVER(ORDER BY del.JobPostingID) AS
--       rownum, del.JobPostingID, del.Title, del.Description, del.IsSearchable, JT.JobTypeID, JST.JobSubTypeID,
--          JT.JobType, JST.JobSubType, com.CompanyName, com.CompanyProfile
--          FROM DELETED_JobPostings as del
--          INNER JOIN Web_Companies as com ON del.CompanyID = com.CompanyID
--          JOIN DELETED_JobPostings_Types as delJPT ON delJPT.JobPostingID = del.JobPostingID
--          JOIN Web_JobSubtypes as JST ON JST.JobSubtypeID = delJPT.JobSubtypeID
--          JOIN Web_JobTypes as JT ON JT.JobTypeID = JST.JobTypeID ) AS OOOOO
--WHERE IsSearchable=1 and LEN(Description)>100 and JobTypeID=@JobTypeID and rownum >= 5 AND rownum <= 9
------------------
