January 23, 2011

Monitoring SQL server resources

While investigating performance issues related to SQL Server resources, first identify the bottleneck and then resolve it. The three most common bottlenecks are memory, CPU, and disk.

1. Memory bottleneck:
SQL Server will use as much memory as the host server makes available. Data pages reside on disk, but disks are slow, so SQL Server caches data pages in memory (the buffer pool) so they can be served to users as quickly as possible when requested.
There are two types of memory pressure.
i) External memory pressure: SQL Server is unable to acquire sufficient memory from the operating system.
If SQLServer: Buffer Manager – Total pages and SQLServer: Buffer Manager – Target pages are the same, SQL Server has sufficient memory; if Target pages is greater than Total pages, there is external memory pressure.
ii) Internal memory pressure: the SQL Server buffer pool (the memory area that holds data pages) shrinks in response to other memory consumers such as linked servers, extended stored procedures, etc.

• Memory: Available Bytes < 4 MB indicates memory pressure.
• VAS (virtual address space) is made up of physical memory plus the memory provided by the page file. If Memory: Pages/sec is consistently greater than 1, there is memory pressure.
• SQLServer: Buffer Manager – Page life expectancy < 300 seconds indicates memory pressure.
• SQLServer: Buffer Manager – Buffer cache hit ratio reports the percentage of pages requested by queries that were found in the buffer pool. If it is < 90, requested pages are mostly being read from disk into memory, which indicates a memory bottleneck.
• SQLServer: Buffer Manager – Stolen pages: if this is greater than Target pages, there is memory pressure.
• SQLServer: Memory Manager – Memory Grants Pending: if this is non-zero, there is memory pressure.
• SQLServer: Buffer Manager – Lazy writes/sec records the number of times per second SQL Server flushes dirty pages from the buffer pool to disk. If it is consistently 20 or more, there is memory pressure. Most of these counters can also be read with T-SQL, as sketched below.
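
A minimal sketch of reading these counters from sys.dm_os_performance_counters (exact counter names vary slightly between SQL Server versions, so treat the list as illustrative):

-- Buffer Manager / Memory Manager counters relevant to memory pressure
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Page life expectancy',
                       N'Buffer cache hit ratio',
                       N'Memory Grants Pending',
                       N'Lazy writes/sec',
                       N'Stolen pages',
                       N'Target pages',
                       N'Total pages');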


2. CPU bottleneck:
• If Processor: % Processor Time (_Total) is consistently > 80, there is CPU pressure.
• If System: Processor Queue Length is consistently > 2, there is CPU pressure.
• If SQLServer: SQL Statistics – SQL Compilations/sec or SQL Re-Compilations/sec is high relative to Batch Requests/sec, there is CPU pressure; a sampling sketch follows this list.
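
A hedged sketch for sampling those compilation counters; the "/sec" values exposed by sys.dm_os_performance_counters are cumulative since the last restart, so take two readings and compare the difference to get a rate.

-- SQL Statistics counters (cumulative; sample twice to compute per-second rates)
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%SQL Statistics%'
  AND counter_name IN (N'Batch Requests/sec',
                       N'SQL Compilations/sec',
                       N'SQL Re-Compilations/sec');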

3. Disk bottleneck:
• If PhysicalDisk: Avg. Disk sec/Read or Avg. Disk sec/Write is consistently above about 20 ms (or % Disk Time stays high), there is disk pressure.
• If SQLServer: Access Methods – Full Scans/sec is greater than 1, or Page Splits/sec is high, there is disk pressure.
• If % Total Processor Time is consistently > 80 while the disk counters above are also elevated, the disk subsystem may be the underlying bottleneck. Per-file latency can also be checked from inside SQL Server, as sketched below.
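
A minimal sketch using sys.dm_io_virtual_file_stats; the figures are cumulative since SQL Server last started, so they show average latency over the whole uptime rather than a point-in-time spike.

-- Average read/write latency (ms) per database file since startup
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_ms DESC;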

January 19, 2011

Difference between DELETE and TRUNCATE

DELETE logs each row affected by the statement in the transaction log and physically removes the rows from the file, one row at a time. Recording each affected row can cause your transaction log to grow massively if you are deleting huge numbers of rows. However, when you run your databases in full recovery mode, this detailed logging is necessary for SQL Server to be able to recover the database to the most recent state should a problem arise. The fact that each row is logged also explains why DELETE statements can be slow.

TRUNCATE is faster than DELETE due to the way TRUNCATE "removes" rows. Actually, TRUNCATE does not remove data, but rather deallocates whole data pages and removes the pointers to those pages. The data still exists until it is overwritten or the database is shrunk. This action does not require a lot of resources and is therefore very fast. It is a common mistake to think that TRUNCATE is not logged. This is wrong. The deallocation of the data pages is recorded in the log file, which is why BOL refers to TRUNCATE as a "minimally logged" operation. You can use TRUNCATE within a transaction, and when that transaction is rolled back, the data pages are reallocated and the database is back in its original, consistent state.
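
A quick demonstration of that last point (a sketch; the table name is purely illustrative):

CREATE TABLE dbo.DemoTruncate (Id INT);
INSERT INTO dbo.DemoTruncate VALUES (1), (2), (3);

BEGIN TRANSACTION;
    TRUNCATE TABLE dbo.DemoTruncate;
    SELECT COUNT(*) AS rows_after_truncate FROM dbo.DemoTruncate;  -- 0
ROLLBACK TRANSACTION;

SELECT COUNT(*) AS rows_after_rollback FROM dbo.DemoTruncate;      -- 3, the TRUNCATE was undone
DROP TABLE dbo.DemoTruncate;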

Some limitations do exist for using TRUNCATE.

· You need to be a member of sysadmin, db_owner, or db_ddladmin, or be the owner of the table, to be able to issue a TRUNCATE TABLE statement.

· TRUNCATE will not work on tables that are referenced by one or more FOREIGN KEY constraints.

So if TRUNCATE is so much faster than DELETE, should one use DELETE at all? Well, TRUNCATE is an all-or-nothing approach. You can't specify that only the rows matching a certain criterion be truncated. It's either all rows or none.

We can have triggers associated with a DELETE operation.
We cannot have trigger logic run for a TRUNCATE operation, because TRUNCATE does not fire DELETE triggers.

You can, however, use a workaround here. Suppose you want to delete more rows from a table than will remain. In this case you can export the rows that you want to keep to a temporary table, run the TRUNCATE statement, and finally re-import the kept rows from the temporary table. If your table contains a column with the IDENTITY property defined on it, and you want to keep the original IDENTITY values, be sure to enable IDENTITY_INSERT on the table before you re-import from the temporary table. Chances are good that this workaround is still faster than a DELETE operation.
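
A hedged sketch of that workaround (table, column, and filter names are purely illustrative):

-- 1. Park the rows you want to keep in a temporary table.
SELECT Id, Name
INTO   #KeepRows
FROM   dbo.BigTable
WHERE  CreatedOn >= '20110101';

-- 2. Deallocate everything in the base table.
TRUNCATE TABLE dbo.BigTable;

-- 3. Re-import the kept rows, preserving the original IDENTITY values.
SET IDENTITY_INSERT dbo.BigTable ON;
INSERT INTO dbo.BigTable (Id, Name)
SELECT Id, Name FROM #KeepRows;
SET IDENTITY_INSERT dbo.BigTable OFF;

DROP TABLE #KeepRows;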

You can also set the recovery model to Simple before you start this workaround, and then back to Full once it is done. However, keep in mind that in this case you might only be able to recover to the last full backup.

SQL Server Interview questions

1. What will be the outcome of the two SELECT @@error statements below?

declare @@error nvarchar(max)

set @@error = 'This is an error'
select @@error -- Select 1

if @@error > 0
print 'invalid error'
else
select @@error -- Select 2

Answer:
Msg 102, Level 15, State 1, Line 1 Incorrect syntax near '@@error'.
Explanation:
@@ERROR is a system function reserved by SQL Server internals and cannot be declared or assigned to as a variable. The second block,
if @@error > 0
print 'invalid error'
would otherwise test the error number of the last statement executed (running SELECT @@ERROR on its own after this batch returns error code 102), but it never runs here: the batch terminates on the first line when it tries to declare @@error as a variable.
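
The usual pattern is to capture @@ERROR into an ordinary variable immediately after the statement you care about, since its value is reset by every subsequent statement. A minimal sketch (the table name is hypothetical):

DECLARE @err INT;
UPDATE dbo.SomeTable SET Col1 = Col1;   -- illustrative statement
SET @err = @@ERROR;                     -- capture before anything else runs
IF @err <> 0
    PRINT 'Last statement failed with error ' + CAST(@err AS VARCHAR(10));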

2. Is it possible, using a trigger, to update a data row before committing the insert to a table, by first updating the row in the INSERTED table?

Answer: It is not possible to update the INSERTED or DELETED tables. The following error would occur:
Msg 286, Level 16,
The logical tables INSERTED and DELETED cannot be updated.
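
If the goal is to change values before they land in the table, one common alternative (a sketch, not part of the original question; table and column names are illustrative) is an INSTEAD OF INSERT trigger that performs the insert itself with the adjusted values:

CREATE TABLE dbo.Orders (Id INT IDENTITY PRIMARY KEY, Amount DECIMAL(10, 2));
GO
CREATE TRIGGER trg_Orders_InsteadOfInsert
ON dbo.Orders
INSTEAD OF INSERT
AS
BEGIN
    -- Insert the rows ourselves, adjusting values on the way in.
    INSERT INTO dbo.Orders (Amount)
    SELECT CASE WHEN i.Amount < 0 THEN 0 ELSE i.Amount END
    FROM inserted AS i;
END;
GO
INSERT INTO dbo.Orders (Amount) VALUES (-5);   -- stored as 0.00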

3. What is the output of the following queries?

DECLARE @inputValue NVARCHAR(20)
DECLARE @student TABLE
(Id INT PRIMARY KEY IDENTITY(1,1),
StudentName NVARCHAR(50),
StudentResult INT)

INSERT INTO @student
VALUES( 'Hardy', 100), ('Rocky', 98), ('Panky', 99)

--Select 1
SET @inputValue = 'Hardy'
SELECT * FROM @student WHERE StudentName IN (@inputValue)

--Select 2
SET @inputValue = 'Hardy, Rocky'
SELECT * FROM @student WHERE StudentName IN (@inputValue)

Answer:
Select 1 returns 1 row; Select 2 returns 0 rows.
Explanation:
IN determines whether a specified value matches any value in a subquery or a list. In the first SELECT, the variable is substituted with its value, so the query is effectively
SELECT * FROM @student WHERE StudentName IN ('Hardy')
In the same way, the second query effectively becomes
SELECT * FROM @student WHERE StudentName IN ('Hardy, Rocky')
Because there is no student whose name is the single string 'Hardy, Rocky', SQL Server does not return any data.
So the final result is: Select 1 returns 1 row; Select 2 returns 0 rows.
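
One simple workaround for matching against a comma-separated variable (a sketch continuing the example above; it assumes the names themselves contain no commas):

SET @inputValue = 'Hardy,Rocky';
SELECT *
FROM @student
WHERE ',' + @inputValue + ',' LIKE '%,' + StudentName + ',%';   -- returns Hardy and Rocky

On SQL Server 2016 and later, STRING_SPLIT offers a cleaner alternative for expanding the list.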

4. What will be the result of the following?

DECLARE @MyTable TABLE (Col1 INT,
Col2 INT)

-- Col1 Col2
INSERT INTO @MyTable VALUES (1, 2),
(3, 4)

UPDATE @MyTable SET Col1 = Col2, Col2 = Col1

SELECT Col1, Col2 FROM @MyTable

Answer:
The values of Col1 and Col2 are swapped (the rows become 2, 1 and 4, 3). In an UPDATE, the expressions on the right-hand side are evaluated against the row's values as they were before the update, so Col1 receives the old Col2 and Col2 receives the old Col1.

Creating DataFrames from CSV in Apache Spark

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("CSV Example").getOrCreate()
sc = spark.sparkContext
Sp...