November 29, 2019

How can e-commerce companies handle the stampede of shoppers on annual sale day?



Most of the world's top e-commerce companies host an annual sale, such as Amazon's Prime Day, Walmart's Black Friday Online Deals, and Flipkart's Big Billion sale. No matter how much these marketplaces prepare, the full scale of consumer activity is known only when the anticipated day arrives, and it often exceeds expectations, causing a spike in transactions within the span of a single second.

To handle such a spike, these marketplaces use distributed databases designed for online transaction processing (OLTP). Even so, the number of transactions they can absorb is limited by the capacity of the individual machines that run the database storage engines.

One solution is to make effective use of caching and a shared storage design to improve scalability, and to apply machine learning methods to predict transaction spikes, so that the predicted workload can be emulated and its QPS (queries per second) analyzed during performance testing.
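As a rough illustration of the caching and QPS-testing ideas above, the sketch below replays an emulated spike of product lookups twice, once directly against a slow store and once through a small in-process LRU cache, and compares the measured QPS. Everything here is an assumption made for illustration: `LRUCache`, `slow_db_lookup`, `replay_workload`, the 1 ms simulated latency, and the shape of the spike are hypothetical and do not describe any marketplace's actual system.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Small in-process LRU cache; a stand-in for a real shared cache tier."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        # Cache hit: mark the key as most recently used and return it.
        if key in self._items:
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cache miss: load from the backing store, then remember the value.
        self.misses += 1
        value = loader(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key
        return value


def slow_db_lookup(product_id):
    """Hypothetical database read; the sleep simulates storage latency."""
    time.sleep(0.001)
    return {"product_id": product_id, "price": 100 + product_id}


def replay_workload(keys, cache=None):
    """Replay a sequence of lookups and return the measured queries per second."""
    start = time.perf_counter()
    for key in keys:
        if cache is None:
            slow_db_lookup(key)
        else:
            cache.get_or_load(key, slow_db_lookup)
    elapsed = time.perf_counter() - start
    return len(keys) / elapsed


# Emulated spike: 200 requests concentrated on 10 hot products.
spike = [i % 10 for i in range(200)]

cache = LRUCache(capacity=50)
qps_without_cache = replay_workload(spike)
qps_with_cache = replay_workload(spike, cache)

print(f"QPS without cache: {qps_without_cache:.0f}")
print(f"QPS with cache:    {qps_with_cache:.0f}")
print(f"cache hits={cache.hits}, misses={cache.misses}")
```

In a real performance test, the replayed key sequence would come from the predicted workload rather than a fixed pattern, and the cache would normally be a shared service rather than a per-process dictionary.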

Creating DataFrames from CSV in Apache Spark

 from pyspark.sql import SparkSession
 spark = SparkSession.builder.appName("CSV Example").getOrCreate()
 sc = spark.sparkContext
 Sp...