Majid Khan

Big Data Platform Administrator
Thane, IN.

About

Big Data Administrator with 4 years of experience in the Cloudera Enterprise (CDP) ecosystem, specializing in end-to-end deployment, upgrades, and administration of complex Big Data clusters and integrated applications across the Hadoop stack. Proven ability to enhance data security, optimize performance, and drive operational efficiency through monitoring, automation, and proactive issue resolution, ensuring robust and compliant data environments for critical business operations.

Work

IDFC FIRST Bank

Platform Admin

Summary

Led end-to-end patching, administered Cloudera Machine Learning, and ensured robust data security and compliance across Big Data operations.

Highlights

Coordinated end-to-end patching operations with systems teams, overseeing cluster startup, shutdown, and post-maintenance validation to maintain 100% stability and compliance across Big Data servers.

Designed and implemented Grafana monitoring dashboards for real-time tracking of user login status, disk space thresholds, and node performance, significantly enhancing operational visibility.

Administered Cloudera Machine Learning (CML) environment, managing user workspaces and resource allocation, and troubleshooting model deployment issues to ensure high availability for data science operations.

Led the remediation of findings from OS and database vulnerability assessments and compliance audits, collaborating with the security team to apply critical patches within stipulated timelines.

Developed a Python automation script leveraging Apache Ranger APIs to proactively identify upcoming policy expirations and send automated email reminders, improving access management efficiency.

DCB BANK

Hadoop Admin

Summary

Deployed and managed production-grade Hadoop clusters, implemented robust security measures, and provided critical 24/7 support for Big Data systems.

Highlights

Successfully deployed a production-grade Cloudera Enterprise Hadoop Cluster CDP 7.1.9 from scratch, managing the full stack from OS to custom enhancements, boosting data management, security, and analytics functionality.

Executed a seamless data migration from an existing CDP 7.1.7 cluster to the new CDP 7.1.9 cluster in collaboration with the Solutions Architect, ensuring data integrity and minimal downtime.

Implemented comprehensive data security protocols, including Kerberos, Data in Transit Encryption, and HDFS data at rest encryption, safeguarding sensitive information across the cluster.

Proactively monitored and optimized cluster health and performance using Cloudera Manager, promptly resolving issues to ensure smooth operations across Production and UAT environments.

Provided 24/7 critical issue resolution support to the Big Data Team, consistently meeting short SLAs and minimizing operational disruptions across the full Big Data stack.

Datamatics Global Services Ltd

Consultant

Summary

Administered and monitored CDH clusters, deployed POC Hadoop environments on AWS, and automated routine administrative tasks.

Highlights

Administered, supported, and monitored CDH clusters using Cloudera Manager, proactively resolving cluster issues to maintain optimal health and performance.

Deployed and configured a multi-node Hadoop Cluster POC on AWS Linux Servers, integrating Apache Spark for testing purposes and validating new functionalities.

Streamlined user onboarding processes, creating user accounts at both OS and application levels to facilitate seamless access to the cluster.

Automated routine administrative tasks, including Kerberos ticket and service user password renewals via crontab, enhancing operational efficiency and security.

Resolved P3 tickets and real-time issues for the development team, minimizing downtime and supporting continuous development efforts.

Education

Pune University

Bachelor's

Computer Engineering

Awards

Best Debutante Award for Data Science and Insights

Awarded By

DCB BANK

Certificates

Cloudera Essentials for CDP

Issued By

Cloudera

Big Data Foundations

Issued By

IBM

Intro to Machine Learning

Issued By

Cloudera

Generative AI Fundamentals

Issued By

Databricks Academy

Skills

Big Data

Sqoop, HDFS, YARN, Spark, Hive, Impala, Kafka, NiFi.

Enterprise Hadoop

Cloudera CDH / CDP, Cloudera Machine Learning.

Security

Kerberos, TLS / SSL, Apache Ranger, Apache Atlas, Data Encryption.

BI and Monitoring Tools

SAS Viya, Grafana.

Cloud (AWS)

S3, EC2, VPC, IAM.

OS

Red Hat Enterprise Linux.