Our client is looking for a knowledgeable, experienced, and motivated Senior Data Engineer.
As a Data Engineer, you will be working at the AI Center of Excellence (AI CoE).
You will play a pivotal role in operationalizing the most urgent data and analytics initiatives of the client's AI strategy.
The bulk of the work involves building, managing, and optimizing data pipelines, then moving them into production for key data, analytics, and AI consumers.
You will ensure compliance with data governance and data management requirements while creating, improving, and operationalizing these integrated and reusable data pipelines, enabling faster data access, integrated data reuse, and vastly improved time-to-solution for the client's data, analytics, and AI initiatives.
- Building data pipelines
- Driving automation through effective metadata management
- Collaborating with business units, data scientists, and project/product teams
- Participating in ensuring compliance and governance during data use
- Defining data structures, and validating, ingesting, processing, and visualizing mobile application data
- A bachelor's or master's degree in Computer Science, Statistics, Applied Mathematics, Data Management, Information Systems, or a related quantitative field
- 7+ years of experience in IT
- 3+ years of experience in Data Engineering
- Experience with Data Governance and Data Management
- Experience with Big Data ingestion, processing and visualization
- Experience with Cloud Native application development
- Experience of working in banks and financial institutions (FinTech experience is a plus)
- Experience with advanced analytics tools for object-oriented/object-function scripting using languages such as Python and Scala
- Strong ability to design, build, and manage data pipelines encompassing data transformation, data models, schemas, metadata, and workload management
- Strong experience with SQL for RDBMSs such as MS SQL Server and MySQL
- Experience with NoSQL databases like MongoDB and Cassandra
- Strong experience building and optimizing data pipelines, pipeline architectures, and integrated datasets from large, heterogeneous data sources, using traditional data integration technologies such as ETL/ELT, data replication/CDC, message-oriented data movement, event processing, and API design and access
- Strong experience optimizing existing ETL processes and data integration and preparation flows, and helping move them to production
- Experience with Apache Kafka, Apache NiFi, and Apache Spark
- Experience with Power BI for semantic-layer-based data discovery
- Experience working with data scientists to refine and optimize data science and machine learning models and algorithms
- Experience applying Data Governance and Data Management practices when moving data pipelines into production, with appropriate data quality, governance, and security standards and certification
- Ability to work across multiple deployment environments (cloud, on-premises, and hybrid) and operating systems, and with containerization and orchestration tools such as Docker and Kubernetes
- Adept in agile methodologies and capable of applying DataOps principles to data pipelines to improve the communication, integration, reuse, and automation of data flows for the client's mobile applications