• Experienced in data modeling, processing data from various sources, analyzing historical data, and developing summarized statistical reports using Apache Hive and Spark.
• Migrated Hive jobs to PySpark using Spark DataFrames and improved performance with Spark SQL, using PySpark 2.3 on Hortonworks.
• Forecasted an insurance provider’s claims costs from historical data and produced accurate, summarized customer segmentation reports using PySpark.
• Experienced in analyzing datasets from data sources, creating aggregated reports with pandas DataFrames and NumPy statistical functions, and exporting results to CSV files in Python.