Sunday, July 25, 2010

Backup


TYPE OF BACKUP

Full Backup
  • A complete backup of the entire database
Differential Backup
  • It is applied on top of a full backup and cannot be created until a full backup exists
  • Each differential backup contains all the changes made since the last full backup was taken (unlike an incremental backup, it is cumulative)
  • The database has special pages called DCM (Differential Changed Map) pages, which track all the extents that have been changed after the full backup.
  • Once we take a new full backup, the DCM is reset.
  • Differential Backup 1 = changes made after full backup
  • Differential Backup 2 = changes made after full backup
  • Differential Backup N = changes made after full backup
Transaction Log Backup
  • It is applied on top of a full backup and cannot be created until a full backup exists
  • Each transaction log backup contains the changes made after the full backup / last transaction log backup
  • Each transaction has an LSN (Log Sequence Number). Each transaction log backup contains the (COMMITTED) data since the last LSN that was part of the full backup / last transaction log backup.
  • Any gap in the LSN chain breaks the backup/restore sequence; this can be fixed by taking a new full backup.
  • Transaction Log Backup 1 = changes made after full backup
  • Transaction Log Backup 2 = changes made after Transaction Log Backup 1
  • Transaction Log Backup 3 = changes made after Transaction Log Backup 2
  • Transaction Log Backup N = changes made after Transaction Log Backup N-1
FileGroup Backup
  • Backs up one or more individual filegroups instead of the whole database; useful for very large databases
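The backup types above map directly to the T-SQL BACKUP command; a minimal sketch (the database name SampleDB and the file paths are placeholders):

```sql
-- Full backup: a complete copy of the database
BACKUP DATABASE SampleDB TO DISK = 'D:\Backup\SampleDB_Full.bak'

-- Differential backup: all changes since the last full backup
BACKUP DATABASE SampleDB TO DISK = 'D:\Backup\SampleDB_Diff.bak' WITH DIFFERENTIAL

-- Transaction log backup: log records since the last full/log backup
BACKUP LOG SampleDB TO DISK = 'D:\Backup\SampleDB_Log.trn'
```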

PROCESS OF TAKING BACKUP
  • Locks the database, blocking all transactions
  • Places a mark in the transaction log
  • Releases the lock
  • Extracts all the pages from the database files and puts them on the backup device
  • Locks the database again, blocking all transactions
  • Places a second mark in the transaction log
  • Releases the lock
  • Extracts all the log records between the two marks and appends them to the backup

RESTORING A BACKUP
  • If it is a full backup, it is very easy to restore using the wizard.
  • If it is a differential / transaction log backup, we have to follow the steps below:
    • Restore the full backup WITH the NORECOVERY option.
    • Once you restore a backup WITH NORECOVERY, the database is shown in a "Restoring..." state in Object Explorer.
    • Restore the first transaction log backup WITH the NORECOVERY option
    • Restore the final transaction log backup / differential backup WITH the RECOVERY option, which is the default
    • Once a backup has been restored WITH the RECOVERY option, no other transaction log / differential backup can be restored on top of it.
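The restore steps above can be sketched in T-SQL as follows (database name and file paths are placeholders):

```sql
-- Restore the full backup, leaving the database in the restoring state
RESTORE DATABASE SampleDB FROM DISK = 'D:\Backup\SampleDB_Full.bak' WITH NORECOVERY

-- Restore intermediate log backups, still WITH NORECOVERY
RESTORE LOG SampleDB FROM DISK = 'D:\Backup\SampleDB_Log1.trn' WITH NORECOVERY

-- Restore the final log backup WITH RECOVERY to bring the database online
RESTORE LOG SampleDB FROM DISK = 'D:\Backup\SampleDB_Log2.trn' WITH RECOVERY
```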

Friday, July 23, 2010

Transaction

DEFINITION

A set of actions that will succeed or fail as a batch.

FUNCTIONS (ACID)

Atomicity - all or nothing:
a set of actions that will succeed or fail as a batch;
either all actions within a transaction succeed, or all of them fail

Consistency - At the end of a transaction, either the new state of the data or the original data is available.

Isolation - During a transaction (before rollback/commit), its data is hidden from any other transaction

Durability - After a transaction is committed, the final state of the data remains available even if the server fails/restarts

TYPE OF TRANSACTION

Implicit - While dealing with ALTER TABLE, CREATE, DELETE, DENY, DROP, FETCH, SELECT, INSERT, UPDATE, TRUNCATE, OPEN, REVOKE, SQL Server applies a transaction automatically
  • By default IMPLICIT_TRANSACTIONS is OFF, meaning statements are auto-committed; you can turn it on using the following command. If you do so, you have to explicitly COMMIT/ROLLBACK after all the above commands
SET IMPLICIT_TRANSACTIONS ON
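A minimal sketch of the behaviour (the table ABC is a placeholder): with IMPLICIT_TRANSACTIONS ON, the first DML statement silently opens a transaction that must be ended explicitly.

```sql
SET IMPLICIT_TRANSACTIONS ON

UPDATE ABC SET A = 1   -- implicitly opens a transaction
SELECT @@TRANCOUNT     -- returns 1: a transaction is now open

COMMIT                 -- must be ended explicitly
SET IMPLICIT_TRANSACTIONS OFF
```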

Explicit -



  • You can have your own transaction within a stored procedure.
  • Each transaction will have 2 ends. Either Rollback or Commit
  • Transaction can be started using BEGIN TRANSACTION command
  • Transaction can be ended using COMMIT TRANSACTION / ROLLBACK TRANSACTION command
  • It is a good practice to use a transaction whenever multiple DML operations are involved. For a single DML operation no explicit transaction is required; that is taken care of by the IMPLICIT transaction
  • If a transaction is open on Object A, no other DML operation will be allowed on Object A from another session. Even a SELECT query won't work unless WITH (NOLOCK) is specified
  • When a huge number of records is being operated on, it is not good practice to use a single transaction because
    1. It will cause a big performance hit
    2. All the other users will be blocked
    3. The transaction log will grow enormously and consume a lot of disk space
An alternate solution is to perform all the operations on a temporary table "without" using a transaction, and then bring the temporary table and the actual table in sync using a transaction.
  • By default all the queries are auto committed in SQL Server
  • COMMIT will commit only the last (innermost) open transaction; it decrements @@TRANCOUNT, and only the outermost COMMIT makes the changes permanent
  • ROLLBACK will roll back ALL active transactions. Any further COMMIT/ROLLBACK will throw an error saying "The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION"

BEGIN TRAN
    UPDATE ABC SET A = 2
    BEGIN TRAN
        UPDATE ABC SET A = 3
    ROLLBACK
    UPDATE ABC SET A = 4
COMMIT

The above code won't work; it will throw the error "The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION." The ROLLBACK ends both open transactions, so the final COMMIT has nothing to commit.
  • Transactions are not applicable to variables - neither scalar variables nor table variables.
  • So it is a good practice to check the TranCount before hitting COMMIT or ROLLBACK
IF (@@TRANCOUNT > 0)
COMMIT
  • ROLLBACK will not reset an identity column back to its original value.
  • You can also give a name to a transaction
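A named transaction combined with the @@TRANCOUNT guard above might look like this (the table ABC and the transaction name are placeholders):

```sql
BEGIN TRANSACTION UpdateABC   -- the name is optional and mainly aids readability
    UPDATE ABC SET A = A + 1

IF (@@TRANCOUNT > 0)          -- guard against committing when nothing is open
    COMMIT TRANSACTION UpdateABC
```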

Wednesday, July 21, 2010

Performance Improvement


T-SQL SIDE

  • Include SET NOCOUNT ON inside SPs
  • If a table's rowcount is required, query sysindexes instead of using COUNT(*)
  • Try to avoid CURSORs wherever possible
  • Use UNION ALL instead of UNION when duplicates do not need to be removed
  • Try to avoid DISTINCT wherever possible
  • Try to avoid unnecessarily complicated joins
  • Return only the required columns instead of using SELECT *
  • Try to avoid IN; an EXISTS query can be used instead
  • Avoid triggers
  • Try to avoid functions in the WHERE or SELECT clause
  • Try to avoid the CONVERT or CAST functions
  • If a small number of records is involved, a table variable can be used instead of a temp table; otherwise go for a temp table
  • Avoid a large number of DML operations in a single transaction
  • Avoid an UPDATE on a table that affects a lot of records. Divide the query into chunks and update the records chunk by chunk, which will be faster
  • Consider using the NOLOCK hint to prevent locking if the data being read is not modified often
  • Try to avoid ISNULL wherever possible
  • Avoid sorting if not required. Can it be achieved using an index?
  • Use built-in functions instead of querying system tables.
e.g. instead of
    if exists (select * from sysobjects where name = 'tblTest1908')
        drop table tblTest1908
use
    if object_id('tblTest1908') is not null
        drop table tblTest1908
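The rowcount tip above (sysindexes instead of COUNT(*)) can be sketched like this; indid 0 is the heap and 1 the clustered index, and the table name is a placeholder:

```sql
-- Fast, metadata-based rowcount instead of scanning with COUNT(*)
SELECT rows
FROM sysindexes
WHERE id = OBJECT_ID('tblTest1908')
  AND indid IN (0, 1)
```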


DESIGN & MAINTENANCE SIDE
  • Have a database design as per normalization rules
  • Create clustered and non-clustered indexes based on the usage
  • Rebuild the indexes monthly/weekly based on the volume of data inserts/updates
  • In the worst case, denormalize
  • If appropriate, reduce lock escalation by using the ROWLOCK or PAGLOCK hints.
  • If you experience a large number of page splits, decrease the fill factor, which will reduce page splits because each page will have more free room to accommodate new data.

EXECUTION PLAN



  • The Execution Plan gives you an idea of which plan SQL Server uses internally to run a query.



    • Table Scan
      • This means no clustered index is being used to retrieve the data; SQL Server uses the heap structure and does a full table scan.
      • If a table has few records then this is fine; otherwise an index needs to be created.
      • Sometimes, even though a non-clustered index is available, SQL Server uses a table scan. This happens in the following 2 scenarios:
        • If the number of retrieved rows is high relative to the total number of records in the table
        • If a large number of rows with almost the same value are coming from the table
      It is efficient when the number of output rows is greater than 50 % of the total number of rows
    • Index Seek
      • This means a non-clustered index is being used to retrieve the data. It generally happens when fewer rows are being returned. It focuses on the qualifying rows and the qualifying pages which contain those rows. It is efficient when the number of output rows is less than 50 % of the total number of rows
    • Clustered Index Seek
      • This means the clustered index is being used to retrieve the data. This is the fastest way to get the data
    • Clustered Index Scan
      • This is just like a table scan; the only difference is that it happens on a table with a clustered index instead of a heap structure. It happens for the same reasons mentioned under table scan
    • Bookmark Lookup
      • It is the mechanism used to navigate from a non-clustered index row to the actual data row in the base table (clustered index)

    • Hash Join / Nested Loop / Merge Join

Hash Join
  • When used: no adequate index on the join columns
  • Phases:
    1. Build Phase
    2. Probe Phase
  • The smaller table becomes the build input and the larger one the probe input. Each row of the smaller table is stored in a hash table, and each row of the larger table is probed against it
  • Types:
    1. In-Memory - the hash table is stored in memory
    2. Grace - when there is not sufficient memory, the hash table is stored in partitions
Nested Loop
  • When used: only one table has an index and the other does not
  • For each row of the outer table, all the rows of the inner table are scanned
Merge Join
  • When used: both tables have clustered indexes (sorted rows)
  • Very effective for larger tables. It scans n1 + n2 rows, unlike n1 x n2 as in a nested loop, because the records are physically sorted. So for rowID = 6 it scans only until RowID = 6; the moment it reaches 7, it stops
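To see which join type the optimizer picks, inspect the execution plan; for investigation you can also force one with a query hint (table and column names are placeholders):

```sql
-- Let the optimizer choose the join type
SELECT o.OrderID, c.CustomerName
FROM Orders o
JOIN Customers c ON c.CustomerID = o.CustomerID

-- Force a hash join for comparison (use hints only while investigating)
SELECT o.OrderID, c.CustomerName
FROM Orders o
JOIN Customers c ON c.CustomerID = o.CustomerID
OPTION (HASH JOIN)
```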

USING DTA (DATABASE TUNING ADVISOR)



  • Create one *.sql file with various DML operations or SPs on which you want to perform tuning

  • Open DTA from "Tools" Menu

  • Connect to a server

  • Select the database & choose the tables which are being used in the DML operations or SPs you have specified in the saved file

  • Select a file which you have saved in Step # 1





  • In "Tuning option", select different options and "Start Analysis"





  • It will then display the tuning recommendations (e.g. suggested indexes and statistics).

Monday, July 19, 2010

Index

DEFINITION
A database object which helps the database engine retrieve data faster, without scanning an entire table

B-TREE STRUCTURE


  • B means Balanced
  • B-tree is always symmetrical, with the same number of pages on both the left and right halves at each level
  • Level
    • Root – Starting page. Can be only 1
    • Intermediate – In between pages which holds only reference
    • Leaf – They can refer to clustered Index / Heap depends on availability of Clustered Index
  • Page
    • Each page is 8192 bytes; out of this, 8060 bytes can store data
    • If a clustered index has been created on an INT column, one index page can store a maximum of 2015 records (8060 / 4 bytes)
    • Initially only a root page exists. It can hold up to 2015 records. The moment the 2016th record is inserted, a page split occurs: the root page becomes a leaf page, one more leaf page is added, half of the records move to the new leaf page, and a new root page is created which holds a reference to the 1st record of each leaf page.



TYPE OF INDEX

  1. Clustered
    1. Maximum only 1 can be created
    2. By default it will be created with the Primary Key. However, a non-clustered index can be created on the primary key with the following syntax:
create table Temp(a int primary key nonclustered, b varchar(10))
    3. No. of Key columns per Index : 16
    4. Index Key Record Size: 900 bytes
  2. Non-clustered
    1. Maximum 999 non-clustered indexes can be created in SQL 2008; previously it was 249
    2. By default it will be created with a Unique Key.
    3. No. of Key columns per Index : 16
    4. Index Key Record Size: 900 bytes
  3. XML
    1. Maximum 249 can be created
    2. No. of Key columns per Index : 16
    3. Index Key Record Size: 900 bytes
  4. Spatial
    1. Can be defined on the GEOMETRY / GEOGRAPHY datatypes
    2. Maximum 249 can be created
    3. No. of Key columns per Index : 16
    4. Index Key Record Size: 900 bytes



a. VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), NTEXT, IMAGE and TEXT columns cannot be part of index key columns
b. Computed columns can be indexed as long as they don't violate the datatype rules


Covered Index

- An index that is constructed such that SQL Server can completely satisfy queries by reading only the index is called a covering index.
- Can be created using the INCLUDE clause
- The included columns are not counted in the maximum of 16 key columns
- A maximum of 1023 columns can be part of a covering index
- The included columns are stored only at the leaf level
- Performance can be improved
- Example
CREATE NONCLUSTERED INDEX IX_TABLE ON TABLE (COLUMN1) INCLUDE (COLUMN2)
Filtered Index
- An index with a WHERE clause
- Must be non-clustered
- Cannot be created on computed columns
- The WHERE condition supports only simple comparisons (>, >=, =, <, <=), IS NULL / IS NOT NULL, and multiple conditions combined with AND
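A minimal sketch of a filtered index (table and column names are placeholders):

```sql
-- Index only the rows that are actually queried
CREATE NONCLUSTERED INDEX IX_Orders_ShippedDate
ON Orders (ShippedDate)
WHERE ShippedDate IS NOT NULL
```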

Online Index Creation
When an index has to be created on a live production database, it is preferable to create it with the ONLINE option set to ON. By default it is OFF. When it is OFF, index creation keeps a lock on the table and prevents any kind of access.
FillFactor
The percentage of each leaf-level page that is filled with data when the index is created or rebuilt; a lower fill factor leaves free room for new rows and reduces page splits.
Index Hint
A way to force SQL Engine to use a particular index.
SELECT * FROM TableName WITH (INDEX = IndexName)
Over time, DML operations will affect the index pages and the database will have fragmented data, which will impact performance.
DBCC SHOWCONTIG ('Domain.Domainvalue') WITH FAST, TABLERESULTS, ALL_INDEXES, NO_INFOMSGS
If the LogicalFragmentation column shows a percentage between 5 and 30, a reorganize is required; if it is > 30, a rebuild is required.
ALTER INDEX IndexName ON TableName REORGANIZE
(Same as DBCC INDEXDEFRAG (dbid, ObjectID, IndexID))
ALTER INDEX IndexName ON TableName REBUILD
(Same as DBCC DBREINDEX ('TableName', 'IndexName'))
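From SQL Server 2005 onwards, the same fragmentation check can be done with the sys.dm_db_index_physical_stats DMV instead of DBCC SHOWCONTIG; a sketch for the current database:

```sql
-- avg_fragmentation_in_percent corresponds to LogicalFragmentation:
-- 5-30 % => REORGANIZE, > 30 % => REBUILD
SELECT OBJECT_NAME(ips.object_id) AS TableName,
       i.name AS IndexName,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
JOIN sys.indexes i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
```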


Saturday, July 17, 2010

Constraints


Primary Key
- A set of one or more columns which uniquely identifies each row in the table
- All columns must be NOT NULL
- Only 1 PK is allowed per table
- By default a clustered index will be created.
- You can create a nonclustered index instead, like this:

CREATE TABLE pkcheck (a int not null)
ALTER TABLE pkcheck ADD CONSTRAINT pk_pkcheck PRIMARY KEY NONCLUSTERED (a)

Foreign Key
- A set of one or more columns which is used to implement referential integrity
- Can be NULL
- The parent table must have a primary key which is referenced as the FK in the child table
- The data types must match between the parent and child columns
CONSTRAINT FK FOREIGN KEY (a,b) REFERENCES PK(c,d)
- The ON DELETE CASCADE option will delete from the child table automatically whenever a row is deleted from the parent table

Unique
- A set of one or more columns which uniquely identifies each row in the table
- Will automatically create a NONCLUSTERED index
- If NULL values are present, only one NULL value is allowed

Default
- Allows you to specify a default value used whenever a value is not passed in an INSERT statement
- Can be applied to NULL and NOT NULL columns

CREATE TABLE def (a INT)
ALTER TABLE def ADD CONSTRAINT def_a DEFAULT(0) FOR a

Check
- Allows you to restrict the range of allowed values
- ALTER TABLE test.OrderHeader
ADD CONSTRAINT ck_subtotal CHECK (SubTotal > 0)
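The constraints above can be combined in a single table definition; a sketch with placeholder table and column names:

```sql
CREATE TABLE OrderDetail (
    OrderDetailID int NOT NULL PRIMARY KEY,                      -- clustered by default
    OrderID       int NULL
        CONSTRAINT fk_od_order REFERENCES OrderHeader(OrderID),  -- foreign key
    LineNumber    int NOT NULL
        CONSTRAINT df_od_line DEFAULT(1),                        -- default value
    Quantity      int NOT NULL
        CONSTRAINT ck_od_qty CHECK (Quantity > 0),               -- check constraint
    RowGuid       uniqueidentifier NOT NULL
        CONSTRAINT uq_od_guid UNIQUE                             -- unique (nonclustered)
)
```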