Outdefine

Rajnesh Bhatia

ETL Developer (Extraction, Transformation, Loading)

3-5 years · Karachi, Pakistan
About Rajnesh Bhatia
Professional Data and DevOps Engineer
ETL Developer (Extraction, Transformation, Loading)
Experience level
Mid-level (3-5 yrs)
Hourly rate
Open to
remote, hybrid, onsite, full-time contract
Python, Spark, AWS, Google Cloud, Transforming data, Azure, Database systems, SQL, Data pipeline, Data warehousing

National University of Science and Technology

Computer Science

Bachelor's Degree, Class of 2019

Data Engineer


Full-time contract, 5/2022 - 11/2022
  1. Evaluated business needs and objectives.
  2. Designed and developed end-to-end data pipelines using cloud infrastructure along with Apache tools.
  3. Worked on improvements to products and processes.
  4. Debugged and resolved complex issues, and recommended improvements to keep ETL pipelines running smoothly.
  5. Optimized Spark jobs and automated existing pipelines to maximize performance.
  6. Updated pipelines and scripts, making them flexible for different sources of data.
  7. Refactored existing code with new techniques and better cluster configuration.
  8. Configured platforms to create reports and dashboards on top of the data in Tableau.
  9. Used the Data Loss Prevention (DLP) API on BigQuery datasets to find PII.
  10. Used Docker images to run some pipelines on a Kubernetes cluster at different frequencies.
  11. Monitored and orchestrated all teams' pipelines using Airflow.
  12. Main languages and frameworks used: SQL, Python, Spark, Tableau, Google Data Studio, Docker, and Kubernetes.
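The "flexible for different sources of data" pattern described above can be sketched as a small format-dispatch step: each source format gets a registered reader, so supporting a new source only means adding one entry. All names, formats, and sample data here are hypothetical, not the actual pipeline code:

```python
import csv
import io
import json

# Hypothetical registry: maps a source format tag to a parser that
# returns a list/структure of records. New formats only need one entry.
READERS = {
    "csv": lambda text: list(csv.DictReader(io.StringIO(text))),
    "json": lambda text: json.loads(text),
}

def ingest(raw: str, fmt: str):
    """Dispatch a raw payload to the parser registered for its format."""
    try:
        return READERS[fmt](raw)
    except KeyError:
        raise ValueError(f"unsupported source format: {fmt}")

rows = ingest("id,amount\n1,9.5\n2,3.0\n", "csv")
print(rows)  # [{'id': '1', 'amount': '9.5'}, {'id': '2', 'amount': '3.0'}]
```

In a real pipeline the readers would stream from SFTP, a DBMS, or cloud storage rather than parse in-memory strings, but the dispatch shape is the same.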

Consultant Data Analytics - Data Engineer

Systems Limited (Visionet Systems)

Full-time contract, 5/2019 - 5/2021
  1. Designed data ingestion in different formats (Excel, CSV, TXT, Parquet, .gz, etc.) from different sources (SFTP, DBMS, cloud storage) into a data lake through Apache NiFi.
  2. Built data pipelines using NiFi for different file formats such as Parquet, JSON, and CSV.
  3. Executed post-ingestion steps, including ETL (data hardening, etc.), placing the output onto the designated storage.
  4. Created external tables in Hive and reports on top of the processed data.
  5. Scheduled, monitored, and orchestrated these Spark jobs with Apache Airflow.
  6. Worked on a dynamic data-dictionary approach for post-ingestion scripts that requires minimal code changes, resulting in quick deployments; this included automating the processes.
  7. Focused on Spark code and processing optimizations and automation to minimize cluster resource usage and enable parallelism.
  8. Tuned the Spark cluster to handle more complex operations at the petabyte level.
  9. Moved Spark to managed Apache Maven from the YARN resource manager.
  10. Created a pipeline that notifies Google Pub/Sub when an email with a CSV attachment arrives on an authorized account.
  11. Created Pub/Sub triggers on an event-based function that takes the data from the email and uploads it to Google Cloud Storage.
  12. Another event is triggered on file upload, which takes the file from GCP storage and loads it into a BigQuery table in a particular dataset.
  13. Performed dual roles of Data and DevOps Engineer on the same project.
  14. Worked on multiple projects at the same time, including Regeneron, Mattress Firm, and Khaadi.
  15. Created reports and dashboards in Tableau from data in different sources and worked on workbooks.
  16. Automated the reporting and publishing mechanism, along with scheduled notifications.
  17. Optimized Tableau dashboards using tuning approaches, finding and resolving complexities through troubleshooting.
  18. For Mattress Firm, multiple data sources were integrated with serverless Azure Data Factory, and the data was processed in Databricks using PySpark.
  19. Streaming data (in KBs) from multiple Khaadi retail stores was used to build dashboards and analytics with Azure Stream Analytics.
  20. Main frameworks and tools used: PySpark, Tableau, Apache tools such as NiFi and Airflow, and AWS, GCP, and Azure cloud infrastructure.
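The email → Pub/Sub → Cloud Storage → BigQuery chain described above hinges on decoding the event payload: Pub/Sub delivers the message body base64-encoded in the event's `data` field. A minimal sketch of the decoding step, assuming hypothetical field names (`sender`, `attachment`) and a hypothetical bucket, might look like:

```python
import base64
import json

def handle_pubsub_event(event: dict) -> str:
    """Decode a Pub/Sub-style event and return the GCS object path
    where the emailed CSV attachment should land (illustrative only)."""
    # Pub/Sub delivers the payload base64-encoded in event["data"].
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # Field names and the bucket below are assumptions for this sketch.
    sender = payload["sender"].replace("@", "_at_")
    return f"gs://email-ingest-bucket/{sender}/{payload['attachment']}"

# Simulate the message the email-watcher pipeline would publish.
msg = {"data": base64.b64encode(
    json.dumps({"sender": "ops@example.com",
                "attachment": "sales_2021.csv"}).encode()).decode()}
print(handle_pubsub_event(msg))
# gs://email-ingest-bucket/ops_at_example.com/sales_2021.csv
```

In the deployed version, a second function subscribed to the storage-upload event would load the object into the target BigQuery table; the decode-and-route step shown here is the glue between the two triggers.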