- Design, develop, and maintain complex data flows within Cloudera DataFlow (Apache NiFi), ensuring scalable, reliable, and high-performance data movement across systems.
- Develop and optimize real-time and near real-time data pipelines leveraging NiFi, Kafka, and CDC technologies (e.g., Debezium, SQL-based connectors).
- Implement integrations with internal and external systems using REST APIs, JDBC, Kafka, and other communication protocols, ensuring secure and resilient data exchange.
- Design and manage data schemas (Avro), metadata, and lineage using Apache Atlas, ensuring full traceability and governance of data flows.
- Define and enforce data security and access control policies using Apache Ranger in alignment with enterprise governance frameworks.
- Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and scalability, including proactive alerting and issue resolution.
- Collaborate with data engineers, architects, and business stakeholders to define requirements, design architectures, and deliver robust data flow solutions.
- Create and maintain technical documentation, SOPs, and runbooks for operational support and knowledge sharing.
- Support platform lifecycle activities, including upgrades, migrations, and enhancements across CDP, NiFi, and Kafka environments.
- Perform other related duties as assigned by the team leader.