Software Engineer - Data


January 20, 2021
About the job

At Kensho, we hire talented people and give them the freedom, support, and resources needed to build cutting-edge technology and products for our parent company, S&P Global. As a result, we produce technology that is scalable, robust, and solves the challenges of one of the world's largest, most successful financial institutions.

We are seeking experienced mid-level Data Engineers to join our Data Engineering Team. The team is responsible for architecting, implementing, and maintaining Kensho's data ingestion, processing, and storage solutions. They work with stakeholders across Kensho and S&P to map the data landscape, triage data requirements, and design and implement a comprehensive data strategy.

Our ideal candidate has experience with the tasks and skills below:

  • Designing, coordinating, and implementing production data platforms using industry standard and/or open source software
  • Navigating between technical leadership and individual contributor roles
  • Negotiating between new requirements, legacy systems, technical debt, and best practices
  • Mentoring colleagues on data engineering best practices and systems design

What You'll Do:

  • Work with industry standard and/or open source software such as PostgreSQL, Kafka, Airflow, etc.
  • Implement and maintain a data management and governance framework
  • Support machine learning and application teams by setting up custom data solutions
  • Design, maintain, and scale Kensho’s data pipeline and document processing platform
  • Build event- and batch-driven ingestion systems for stand-alone software products, machine learning R&D, and API services
  • Develop and administer databases, knowledge bases, and distributed data stores
  • Create and use systems to clean, integrate, or fuse datasets to produce data products
  • Conduct ongoing and periodic studies of system cost efficiency, performance, and overall health
  • Establish and monitor data integrity and value through visualization, profiling, and statistical tools

What You'll Need:

  • Experience with various data-store technologies, distributed messaging platforms, or data processing frameworks
  • Experience designing, architecting, and building reliable data pipelines
  • Experience working with large structured and unstructured data sets
  • Effective coding, documentation, and communication habits
  • Proficient understanding of distributed computing principles
  • Experience integrating/fusing data from multiple data sources
  • Knowledge of various ETL techniques, frameworks, and best practices
  • Experience supporting and working with cross-functional teams in a dynamic environment
  • (Bonus) Experience with Site Reliability Engineering, DevOps (CI/CD), and Cloud Administration