August 12, 2014

Multiple Ways to Improve SQL Query Performance

Multiple Ways to Improve SQL Query Performance

1. Re-writing SQL Query
Re-writing SQL Query by following best practices for getting the fastest response like  avoid writing a SQL query using multiple joins ( JOIN containing more than 4 tables). ( Pls refer Nested Loop http://avishkarm.blogspot.in/2014/08/what-is-nested-loops-join.html)
Try to remove cursors from the query and use set-based query.
Remove non-correlated scalar sub query as a separate query instead of part of the main query and store the output in a variable.
Avoid Multi-statement TVFs are more costly than inline TFVs.

2. Indexes
Index can magically reduce the data retrieval time and help to improve SQL query performance and give you best query response time.
Keep in mind that clustered index should be created on a unique column.
A non-clustered index is most useful if the ratio of qualifying number of rows/total number of rowsis around 5% or less, which means if the index can eliminate 95% of the rows from consideration. As a best practice, most selective columns should be placed leftmost in the key of a non-clustered index.
Dropping unused indexes can help to speed up data modifications without affecting data retrieval.

3. Statistic Creation and Updates

4.Revisit Your Schema Definitions
Availability of the right constraint(FORIGEN KEY, NOT NULL and CEHCK constraints ) on the right place always helps to improve the query performance, like FORIGEN KEY constraint helps to simplify joins by converting some outer or semi-joins to inner joins and CHECK constraint also helps a bit by removing unnecessary or redundant predicates.

No comments:

Creating DataFrames from CSV in Apache Spark

 from pyspark.sql import SparkSession spark = SparkSession.builder.appName("CSV Example").getOrCreate() sc = spark.sparkContext Sp...