Big Data Engineer
Role Overview
We are looking for an experienced Big Data Engineer to architect and manage our global data infrastructure. You will be responsible for petabyte-scale datasets, ensuring they are structured for both low-latency real-time ingestion and deep historical analytics. Your primary mission is to unify our distributed private cloud data lakes, spanning our sites in the US, Europe, and Budapest, into a single orderly, high-performance ecosystem. This role is central to our ability to process massive market data feeds and to give our quantitative teams the fast, reliable data access they need.
Location:
Budapest, 1054
Key Responsibilities
- Data Lake Orchestration: Build and interconnect distributed private cloud data lakes, transforming disparate data silos into a unified, orderly structure.
- Real-Time Ingestion: Design and maintain robust pipelines for real-time data ingestion and streaming analytics, ensuring zero-loss capture of high-velocity market feeds.
- High-Performance Interfaces: Develop and optimize low-latency interfaces with external data providers, ensuring reliable data flow under extreme load.
- Database Architecture: Manage and tune petabyte-scale databases, selecting the appropriate structures for storage and retrieval efficiency.
- Pipeline Automation: Implement automated ETL/ELT processes that prioritize data integrity and performance, utilizing modern columnar storage formats.
Minimum Qualifications
- Database Mastery: Expert-level knowledge of SQL and high-performance columnar DBMSs, with hands-on experience in tools such as ClickHouse, BigQuery, or DuckDB.
- Storage Formats: Deep understanding of optimized file formats, specifically Parquet, Avro, or ORC, and how to leverage them for analytical performance.
- Programming Excellence: Strong software engineering fundamentals are a must. Proficiency in Python, Go, Java, or Scala for building data tools and pipelines.
- Distributed Systems: Experience managing data across distributed environments and understanding the complexities of data consistency and network latency.
- Data Engineering Ecosystem: Proficiency with tools like Spark, Flink, or Kafka for stream and batch processing.
Preferred Qualifications
- Capital Markets Expertise: Previous experience in High-Frequency Trading (HFT) or a deep understanding of market data structures (Level 2/3 feeds).
- kdb+/Q Experience: Experience with kdb+ and the q programming language for time-series data is a significant advantage.
- Cloud Infrastructure: Experience with Hybrid Cloud environments, specifically optimizing data egress/ingress costs and performance.
- HPC Knowledge: Familiarity with high-performance computing clusters and hardware-accelerated data processing.
Benefits
- Competitive Salary: A top-tier compensation package tailored to your experience and benchmarked against the global market.
- Ownership Stake: An employee share option plan (ESOP), allowing you to benefit directly from the company’s long-term growth and success.
- Professional Growth: Access to International Conferences (US, Europe, and beyond) to stay at the forefront of your field.
- Career Development: Structured Career Building Plans with clear milestones and the support needed to transition into leadership or specialized roles.
- Premium Mobility: A Company Car provided as part of your compensation package (subject to role/seniority level).
- Daily Perks: Complimentary lunch, snacks, and premium coffee to keep you fueled throughout the day.
- International Environment: The opportunity to work in a truly global setting with sister companies in the US and EU.