Senior Data Engineer – Real-Time & Distributed Systems (GCP)
Job details
Who we are: Innodata (NASDAQ: INOD) is a leading data engineering company. With more than 2,000 customers and operations in 13 cities around the world, we are the AI technology solutions provider of choice to 4 out of 5 of the world’s biggest technology companies, as well as leading companies across financial services, insurance, technology, law, and medicine. By combining advanced machine learning and artificial intelligence (ML/AI) technologies, a global workforce of subject matter experts, and a high-security infrastructure, we’re helping usher in the promise of clean and optimized digital data to all industries.
Innodata offers a powerful combination of both digital data solutions and easy-to-use, high-quality platforms.
Our global workforce includes over 3,000 employees in the United States, Canada, United Kingdom, the Philippines, India, Sri Lanka, Israel and Germany.
We’re poised for a period of explosive growth over the next few years.

Key Responsibilities
- Design, build, and optimize scalable data pipelines for batch and real-time processing
- Develop and maintain event-driven architectures for high-throughput systems
- Ensure data reliability, performance, and low-latency processing across distributed environments
- Collaborate with data scientists and application teams to enable analytics and AI use cases
- Implement best practices in performance tuning, monitoring, and cost optimization
Requirements
- Advanced proficiency in Python for backend and large-scale data processing
- Strong experience building and managing big data pipelines in production environments
- Hands-on expertise with workflow orchestration tools such as Airflow or Google Cloud Composer
- Proven experience in batch and streaming data processing using Apache Spark and Apache Beam (Dataflow)
- Experience designing and operating event-driven systems using Pub/Sub
- Strong understanding of distributed systems architecture and scalability patterns
- Experience managing globally distributed, low-latency datasets
- Hands-on experience with NoSQL databases and/or Google Cloud Spanner
- Strong knowledge of system reliability, fault tolerance, and performance optimization

Preferred Skills
- Proficiency in Go, Java, or Scala
- Experience with Kafka or Flume for streaming ingestion
- Deep familiarity with the Google Cloud Platform ecosystem
- Experience with production monitoring, logging, and observability frameworks
- Exposure to high-availability, multi-region deployments

Please be aware of recruitment scams involving individuals or organizations falsely claiming to represent employers.
Innodata will never ask for payment, banking details, or sensitive personal information during the application process. To learn more about how to recognize job scams, please visit the Federal Trade Commission’s guide at https://consumer.ftc.gov/articles/job-scams.
If you believe you’ve been targeted by a recruitment scam, please report it to Innodata at verifyjoboffer@innodata.com and consider reporting it to the FTC at ReportFraud.ftc.gov.