January 21, 2010

Pros and Cons of stored procedures

* Stored procedures generally result in improved performance because the database can optimize the data access plan used by the procedure and cache it for subsequent reuse.
* Stored procedures can be individually secured within the database. A client can be granted permissions to execute a stored procedure without having any permissions on the underlying tables.
* Stored procedures result in easier maintenance because it is generally easier to modify a stored procedure than it is to change a hard-coded SQL statement within a deployed component.
* Stored procedures add an extra level of abstraction from the underlying database schema. The client of the stored procedure is isolated from the implementation details of the stored procedure and from the underlying schema.
* Stored procedures can reduce network traffic, because SQL statements can be executed in batches rather than sending multiple requests from the client.


The advantages of using stored procedures are:

1) Network Bandwidth: Assume that the application server(s) and the database server(s) are separate servers. Since the source code is actually stored on the database server, the application only needs to send the name and the parameters for executing it and this in turn reduces the amount of data that is sent to the server. When you use embedded SQL or dynamically generated SQL through an ORM, then the full source of commands must be transmitted and in a scenario where there is a lot of concurrent activity going on and the load on the system requires a lot of users, this can very soon become a performance bottleneck. This can be mitigate in part by a judicious usage of views as well.

2) Abstraction Layer: It helps in separating the business logic from data manipulation logic. Since the interface to the application remains the same, changes done internally to the stored procedures/packages do not effect the application and in turn leads to easy deployment of changes.

3) It offers simplified batching of the commands. Since stored procedures/packages are meant to do a “Unit of Work”, this procedural logic can be simple to maintain and offers additional advantages like making use of the rich feature functionality that the database engines provide. SQL is a SET based language and using SET based procedural logic is the easiest and most performant way of dealing with the data. With every new release of Oracle, SQL Server or DB2 LUW, new features are being introduced in PL/SQL, T-SQL and/or SQL/PL which makes handling of different types of requirements very easy in the DB layer code.

4) Increased data security and integrity: One can secure the tables for direct access and only grant privileges on the stored procedures/packages.

5) By using bind variables (parameterized queries), SQL injection issues are mitigated as well.

6) Code Instrumentation and tracing logic can be built in very easily using the stored procedures. This is one thing that we implemented for one of our clients recently. We created a table which had a list of the DB code that was being used in the schema and this table had a trace_flag column in it which could have 4 different values:

0 (no code instrumentation),

1 (log the sequence of events),

2 ( log the sequence of events and the time taken by those SQL statements),

3 ( log the sequence of events + the time taken + the execution plan from that point of time – since the execution plan can very easily be different at the time of execution under a load scenario vs when you actually run it separately), and

4 (Start the trace – example: starting 10046 level 12 trace in the case of Oracle).

Using this logic, code instrumentation and troubleshooting production issues became very easy. One could then run reports against the data that was logged and present it to the end user or the support personnel. Code instrumentation can be done in the application tier as well using the same guidelines (or using logging blocks like MSFT logging block in .Net) and a good programmer would always instrument their code. However, for the DB code, this code instrumentation becomes a lot more easier to implement.

Cons of using stored procedures:

1) If your application runs on multiple RDBMS, example: You are a vendor and you need to provide your product that runs on Oracle, SQL Server and DB2 LUW in order to expand your customer base, then in that scenario, you have to code or put fixes in for three different code bases. Not only that, you need to have proper staffing to ensure that the code written is optimal since the locking and concurrency mechanisms are very different between these RDBMS. Also, the language used by all these “big 3″ is very different as well.

2) We have seen client shops which offload all of their CRUD operations on the DB tier – as a result they end up with one-liner stored procedures and if you have say 200 objects, you now have 200 * 4 (select/insert/update/delete) stored procedures or one procedure per object with the flag option to indicate the operation and need to code the procedure to use dynamic SQL in order to take into account the conditional parameter logic. Maintenance becomes a nightmare in this case. Also, developers/DBAs sometimes get carried away with this approach and forget that SQL is a set based language – one example is that using this scheme, a client shop was doing purges of the data and kept calling the delete procedure by passing in the ID (surrogate Key) value when they actually could have purged and archived the data using a SET based logic. Procedures are supposed to do unit of work – having one liner procedures and invoking that many calls in a batch does not yield any benefit. In addition, it has to un-necessarily incur the cost of checking permissions and plan associated with that one liner procedure – the cost is albeit a very miniscule one.

3) Parsing of strings is not one of the forte’s of the DB languages – that code is better suited in the application tier unless you start using CLR or Java code.

So, the bottom line is that one needs to carefully evaluate which portion(s) of the application really belong as stored procedure/packages. In applications that work on volume of data and do bulk data processing, it is always advisable to have that logic in stored procedures/packages that reside on the DB layer so that one can take advantage of the SET based approach of the SQL language.

One can also use ORMs (Object Relational Mappers) like Hibernate to prepare their data access strategy – one can then extend it to make calls to the database stored procedure (if so desired in certain situations), have externalized SQL statements, have dynamic SQL statements being prepared by the ORM etc.. One just has to make the right decision depending upon the application and the requirements that are needed.

Another thing to remember is when people point out that an advantage of stored procedures is that the code is always pre-compiled, that is not always true, there can be scenarios that can lead to re-compilation. Also, if proper bind variables are being used for the queries built using an ORM, it serves the same purpose (i.e. has the same advantage as that of a parsed/compiled stored procedure query) since the plan for that SQL statement is parsed and compiled. Depending upon a lot of factors (cache sizes, concurrent load, statistics updates etc.), that plan may or may not be available the next time the call gets made.

A good mix of an ORM plus DB code is usually the best compromise to get the best of both the worlds. Deciding what goes where is the key and though there are guidelines on that, it can vary from application to application depending upon the requirements.

January 06, 2010

Maximum Capacity Specifications for SQL Server

Maximum Capacity Specifications for SQL Server

The following tables specify the maximum sizes and numbers of various objects defined in SQL Server components.

Database Engine Objects
The following table specifies the maximum sizes and numbers of various objects defined in SQL Server databases or referenced in Transact-SQL statements.

SQL Server Database Engine object Maximum sizes/numbers SQL Server (32-bit) Maximum sizes/numbers SQL Server (64-bit)
Batch size1
65,536 * Network Packet Size
65,536 * Network Packet Size

Bytes per short string column
8,000
8,000

Bytes per GROUP BY, ORDER BY
8,060
8,060

Bytes per index key2
900
900

Bytes per foreign key
900
900

Bytes per primary key
900
900

Bytes per row8
8,060
8,060

Bytes in source text of a stored procedure
Lesser of batch size or 250 MB
Lesser of batch size or 250 MB

Bytes per varchar(max), varbinary(max), xml, text, or image column
2^31-1
2^31-1

Characters per ntext or nvarchar(max) column
2^30-1
2^30-1

Clustered indexes per table
1
1

Columns in GROUP BY, ORDER BY
Limited only by number of bytes
Limited only by number of bytes

Columns or expressions in a GROUP BY WITH CUBE or WITH ROLLUP statement
10
10

Columns per index key7
16
16

Columns per foreign key
16
16

Columns per primary key
16
16

Columns per nonwide table
1,024
1,024

Columns per wide table
30,000
30,000

Columns per SELECT statement
4,096
4,096

Columns per INSERT statement
4096
4096

Connections per client
Maximum value of configured connections
Maximum value of configured connections

Database size
524,272 terabytes
524,272 terabytes

Databases per instance of SQL Server
32,767
32,767

Filegroups per database
32,767
32,767

Files per database
32,767
32,767

File size (data)
16 terabytes
16 terabytes

File size (log)
2 terabytes
2 terabytes

Foreign key table references per table4
253
253

Identifier length (in characters)
128
128

Instances per computer
50 instances on a stand-alone server for all SQL Server editions except for Workgroup. Workgroup supports a maximum of 16 instances per computer.

SQL Server supports 25 instances on a failover cluster.
50 instances on a stand-alone server.

25 instances on a failover cluster.

Length of a string containing SQL statements (batch size)1
65,536 * Network packet size
65,536 * Network packet size

Locks per connection
Maximum locks per server
Maximum locks per server

Locks per instance of SQL Server5
Up to 2,147,483,647
Limited only by memory

Nested stored procedure levels6
32
32

Nested subqueries
32
32

Nested trigger levels
32
32

Nonclustered indexes per table
999
999

Number of distinct expressions in the GROUP BY clause when any of the following are present: CUBE, ROLLUP, GROUPING SETS, WITH CUBE, WITH ROLLUP
32
32

Number of grouping sets generated by operators in the GROUP BY clause
4,096
4,096

Parameters per stored procedure
2,100
2,100

Parameters per user-defined function
2,100
2,100

REFERENCES per table
253
253

Rows per table
Limited by available storage
Limited by available storage

Tables per database3
Limited by number of objects in a database
Limited by number of objects in a database

Partitions per partitioned table or index
1,000
1,000

Statistics on non-indexed columns
30,000
30,000

Tables per SELECT statement
Limited only by available resources
Limited only by available resources

Triggers per table3
Limited by number of objects in a database
Limited by number of objects in a database

Columns per UPDATE statement (Wide Tables)
4096
4096

User connections
32,767
32,767

XML indexes
249
249


1Network Packet Size is the size of the tabular data stream (TDS) packets used to communicate between applications and the relational Database Engine. The default packet size is 4 KB, and is controlled by the network packet size configuration option.

2The maximum number of bytes in any index key cannot exceed 900 in SQL Server. You can define a key using variable-length columns whose maximum sizes add up to more than 900, provided no row is ever inserted with more than 900 bytes of data in those columns. In SQL Server, you can include nonkey columns in a nonclustered index to avoid the maximum index key size of 900 bytes.

3Database objects include objects such as tables, views, stored procedures, user-defined functions, triggers, rules, defaults, and constraints. The sum of the number of all objects in a database cannot exceed 2,147,483,647.

4Although a table can contain an unlimited number of FOREIGN KEY constraints, the recommended maximum is 253. Depending on the hardware configuration hosting SQL Server, specifying additional FOREIGN KEY constraints may be expensive for the query optimizer to process.

5This value is for static lock allocation. Dynamic locks are limited only by memory.

6If a stored procedure accesses more than 8 databases, or more than 2 databases in interleaving, you will receive an error.

7If the table contains one or more XML indexes, the clustering key of the user table is limited to 15 columns because the XML column is added to the clustering key of the primary XML index. In SQL Server, you can include nonkey columns in a nonclustered index to avoid the limitation of a maximum of 16 key columns. For more information, see Index with Included Columns.

8 SQL Server supports row-overflow storage which enables variable length columns to be pushed off-row. Only a 24-byte root is stored in the main record for variable length columns pushed out of row; because of this, the effective row limit is higher than in previous releases of SQL Server. For more information, see the "Row-Overflow Data Exceeding 8 KB" topic in SQL Server Books Online.

Maximum in Oracle:

Maximum in Oracle:

Database file size Maximum Operating system dependent. Limited by maximum
operating system file size; typically 222 or 4 MB blocks


Bigfile Tablespaces Number of blocks 232 (4 GB) blocks
Smallfile (traditional)
Tablespaces
Number of blocks 222 (4 MB) blocks


GROUP BY clause Maximum length The GROUP BY expression and all of the
nondistinct aggregate functions (for example, SUM,
AVG) must fit within a single database block.

Indexes Maximum per table Unlimited
total size of indexed column 75% of the database block size minus some
overhead

Columns Per table 1000 columns maximum
Restrict the number of columns less than 255 and the performance will be good.

Per index (or clustered index) 32 columns maximum
Per bitmapped index 30 columns maximum
Constraints Maximum per column Unlimited
Subqueries Maximum levels of subqueries
in a SQL statement
Unlimited in the FROM clause of the top-level
query
255 subqueries in the WHERE clause
Partitions Maximum length of linear
partitioning key
4 KB - overhead
Maximum number of columns
in partition key
16 columns
Maximum number of partitions
allowed per table or index
64 KB - 1 partitions
Rows Maximum number per table Unlimited

Tables Maximum per clustered table 32 tables
Maximum per database Unlimited

January 05, 2010

Compare MS SQL server Vs Oracle 10g

SQL Server:

Pros:
Its easy installation, self-tuning capabilities, graphical administration, integrated business intelligence and extensive collection of help wizards
The total cost of ownership (TCO) of SQL Server 2005 is lower than that of Oracle
Specialised index on a computed column
Indexed View

Cons:
Locking and concurrency: SQL Server has no multi-version consistency model, which means that "writers block readers and readers block writers" to ensure data integrity

Performance and tuning:
a. DBA has no "real" control over sorting and cache memory allocation. The memory allocation is decided only globally in the server properties memory folder, and that applies for ALL memory and not CACHING, SORTING, etc.
b. All pages (blocks) are always 8k and all extents are always 8 pages (64k). This means you have no way to specify larger extents to ensure contiguous space for large objects.
c. In SQL Server, no range partitioning of large tables and indexes. In Oracle, a large 100 GB table can be seamlessly partitioned at the database level into range partitions.

With SQL Server 2005, INSERT, UPDATE, and DELETE statements are executed serially (MERGE is not supported).

Oracle:
Pros :
you can use Oracle on multiple platforms. Whereas Microsoft created SQL Server to be used on the Microsoft platform only, Oracle is available on multiple platforms, including Windows, Unix and now Linux, which is the foundation of Oracle's Real Application Clusters (RAC) strategy.

Locking and concurrency: "readers don't block writers and writers don't block readers." This is possible without compromising data integrity because Oracle will dynamically re-create a read-consistent image for a reader of any requested data that has been changed but not yet committed. In other words, the reader will see the data as it was before the writer began changing it (until the writer commits).

function-based indexes
Oracle will execute INSERT, UPDATE, DELETE, and MERGE statements in parallel when accessing both partitioned and non-partitioned database objects

Cons:
"Implementation of something similar to MSSQL Identity by using Oracle sequence would require reflecting the sequence name in the application or creating a trigger for each table/sequence pair.
Cost is higher
Required skilled DBA.

CONCURRENCY MODEL
concurrency control. The main differences are summarized in the table below:
Oracle Database 10g SQL Server 2005
Multi-version read consistency Always enabled. Not by default.
Must be enabled.
Non-escalating row-level locking Yes Locks escalate

SQL Server 2005 introduces two new isolation levels3:
read committed with snapshots (statement-level read consistency)
snapshot isolation (transaction-level read consistency)
These isolation levels correspond to Oracle’s READ COMMITTED and SERIALIZABLE isolation levels, respectively

December 21, 2009

What SQL server DBA should know?

 About SQL Server Versions
 New Installation
 New Systems – Setup Options
 Diving Into an Existing Installation
 Locate SQL Server Instances
 Check For Service Packs Installed
 Performance Monitor
 Performance Monitor Checks
 Disk Space Checks
 Check on Jobs
 Check On Maintenance Plans
 Check For Off-System Backups
 File Sizes, Locations and Trending
 Reviewing Security
 Best Practices Tools
 Best Practices Analyzer
 Best Practices Analyzer Targets
 Microsoft Baseline Security Analyzer 2.1
 Identify Logins
 Firewalls, Access Protection
 Disaster Recovery Plan
 Understand Backup Models
 Recovery Model: Full
 Recovery Model: Simple
 Recovery Model: Bulk_logged
 Disaster Planning and Testing
 Manual Processes
 Best Practices
 Third-Party Tools…
 SQL Server Profiler
 Profiler Tips
 Database Engine Tuning Advisor
 What Is An Index?
 What Happens Without an Index?
 Add an Index…
 How Do I Figure Out Where I Need Indexes?
 How Do I Create Indexes?
 Creating Indexes - MS
 Options – Creating Indexes
 Creating Indexes – T-SQL
 Clustered vs. Non-Clustered
 Clustered Index Tips
 "Covered" Indexes
 When NOT To Index
 Index Maintenance
 Index Checking
 Index Defragmentation
 Index Defragmentation in SQL2k5
 About Joins
 A Note About "Aliasing"
 Simple Joins
 Basic Joins
 Join Tips

November 26, 2009

Defragmenting Indexes in SQL Server 2005

There are several things that should be done on a regular basis and one of these things is to manage database fragmentation. Depending on the tables, queries and indexes that are being used fragmentation can cause performance issues as well as using unnecessary space in the database. Database fragmentation is similar to disk fragmentation in that the data is stored in various places in the database file instead of sequentially or next to like data within the database. This often occurs where you have non-sequential keys and the constant inserting, updating and deleting of data causes the data to become fragmented as well as the use of additional data pages to store the data.
As the indexes are modified, the information stored in them becomes fragmented, resulting in the information being scattered across the data files. When this occurs, the logical ordering of the data no longer matches the physical ordering, which can lead to a deterioration of query performance.
To fix this problem, indexes must be periodically reorganized or rebuilt (defragmented) so the physical order of the leaf-level pages matches the logical order of the leaf nodes.

So where is the information stored?
To analyze SQL Server 2005 or 2008 indexes, you use the system function sys.dm_db_index_physical_stats to determine which indexes are fragmented and the extent of that fragmentation.

The sys.dm_db_index_physical_stats function returns a number of values that provide details about the indexes you specify. avg_fragmentation_in_percent: Percentage of the logical index that is fragmented.
fragment_count: Number of fragments in the leaf level.
avg_fragment_size_in_pages: Average number of pages in a leaf-level fragment.
page_count: Number of index or data pages.
An index always has at least one fragment (fragment_count). The maximum number of fragments that an index can have is equal to the number of pages (page_count).
Ideally, the avg_fragmentation_in_percent value should be as close to zero as possible, and the avg_fragment_size_in_pages should be as high as possible.
Microsoft recommends that you reorganize your index if the avg_fragmentation_in_percent value is less than or equal to 30% and rebuild the index if the value is greater than 30%.
Reorganizing Indexes
When you reorganize an index, SQL Server physically reorders the leaf-level pages to match the logical order of the leaf nodes. The process uses the existing pages only and does not allocate new ones, but it does compact the index pages. In addition, reorganization uses minimal resources and is automatically performed online, without blocking queries or updates. You should reorganize indexes only if they’re lightly fragmented, otherwise, you should rebuild them
ALTER INDEX PK_StoreContact_CustomerID_ContactID ON AdventureWorks.Sales.StoreContact
REORGANIZE

Rebuilding Indexes
Rebuilding an index is generally more effective than reorganizing it because it drops the original index and builds a new one. As a result, the index starts clean with minimal fragmentation and the pages are compacted, and new pages are allocated as needed. In addition, you can choose to rebuild an index offline (the default) or online.
ALTER INDEX IX_StoreContact_ContactID ON
AdventureWorks.Sales.StoreContact
REBUILD

SQL Server script to rebuild all tables which are fragmented.
-- =============================================
-- Author: Avishkar Meshram
-- Create date: 11/26/2009
-- Description: Check fragmentation in table and rebuild table
-- =============================================
CREATE PROCEDURE [dbo].[RebuildFragmentedTables]
-- Add the parameters for the stored procedure here
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @Table VARCHAR(255)
DECLARE @cmd NVARCHAR(500)
DECLARE @dbid smallint;
DECLARE @FragCount smallint; -- Fragmentation level in table
SET @cmd = 30
Create table #RebuildTable
(
Table_Name varchar(50)
)
-- Gather tables which are fragmented
INSERT INTO #RebuildTable
SELECT OBJECT_NAME(OBJECT_ID)
FROM sys.dm_db_index_physical_stats(DB_ID(),
NULL, NULL, NULL , 'DETAILED')
WHERE avg_fragmentation_in_percent > 30

SET @cmd = 'DECLARE TableCursor CURSOR FOR
SELECT Distinct Table_Name
FROM #RebuildTable'
-- Rebuild only those tables which are fragmented
EXEC (@cmd)
OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @Table
WHILE @@FETCH_STATUS = 0
BEGIN
SET @cmd = 'ALTER INDEX ALL ON ' + @Table + ' REBUILD'
PRINT (@cmd)
--EXEC (@cmd)
FETCH NEXT FROM TableCursor INTO @Table
END
CLOSE TableCursor
DEALLOCATE TableCursor
DROP TABLE #RebuildTable
END

July 22, 2009

SQL SERVER – 2005 – List Tables in Database With/Without Primary Key

USE DatabaseName;
GO
SELECT SCHEMA_NAME(schema_id) AS SchemaName,name AS TableName
FROM sys.tables
WHERE OBJECTPROPERTY(OBJECT_ID,'TableHasPrimaryKey') = 0
ORDER BY SchemaName, TableName;
GO

SELECT i.name AS IndexName,OBJECT_NAME(ic.OBJECT_ID) AS TableName,COL_NAME(ic.OBJECT_ID,ic.column_id) AS ColumnName
FROM sys.indexes AS i
INNER JOIN sys.index_columns AS ic
ON i.OBJECT_ID = ic.OBJECT_ID
AND i.index_id = ic.index_id
WHERE i.is_primary_key = 1

Add windows user to sysadmin role to SQL server 2005

A SQL Server instance can contain multiple user databases. Each user database has a specific owner; the owner defaults to the database creator. By definition, members of the sysadmin server role (including system administrators if they have access to SQL Server through their default group account) are database owners (DBOs) in every user database. In addition, there is a database role, db_owner, in every user database. Members of the db_owner role have approximately the same privileges as the dbo user.

To explicitly grant SQL Server 2005 logins directly to the service accounts that are used by SQL Server 2005 and by SQL Server Agent.
CREATE LOGIN [FFSERVER\Calldp] FROM WINDOWS WITH DEFAULT_DATABASE=[master]

Use an account that is a member of the SYSADMIN fixed server role to provision the logins that you added in above step in the SYSADMIN fixed server role.
EXEC master..sp_addsrvrolemember @loginame = N'FFSERVER\Calldp', @rolename = N'sysadmin'

July 19, 2009

To get table and column which are using particular data type

select object_name(c.id),c.name
from sys.objects o,
sys.syscolumns c,
sys.types t
where o.object_id=c.id
and c.xtype=t.system_type_id
and t.name='varbinary'
and o.type='U'