Tech News - Eran Stiller

Pinterest's Moka Data Processing Platform - High Level Architecture Diagram

From Hadoop to Kubernetes: Pinterest’s Scalable Spark Architecture on AWS EKS

July 29, 2025

Pinterest revamped its data infrastructure by transitioning from a legacy Hadoop system to the Moka platform, leveraging Kubernetes and Spark on AWS EKS. This strategic shift enhances job isolation, simplifies deployment, and optimizes resource management, leading to reduced costs and improved efficiency. Read more on InfoQ

Illustration of new segments in Northguard being added to the range and assigned to potentially new brokers

LinkedIn Announces Northguard and Xinfra: Scaling Beyond Kafka for Log Storage and Pub/Sub

June 26, 2025

LinkedIn today announced Northguard, a scalable log storage system that replaces Kafka, and Xinfra, a virtualized Pub/Sub layer. Northguard delivers sharded data & metadata, log striping, strong consistency, and self-balancing clusters at a larger scale than Kafka, while Xinfra enables seamless migration and unified access across Kafka and Northguard. Read more on InfoQ

High-level architecture diagram placing the Apollo MCP server in front of existing GraphQL APIs, where it is consumed by AI agents

Apollo GraphQL Launches MCP Server: A New Gateway Between AI Agents and Enterprise APIs

May 28, 2025 | May 29, 2025

Apollo GraphQL recently launched its MCP Server, enabling businesses to securely and efficiently integrate AI agents with existing APIs using GraphQL. The platform empowers teams to scale innovation and drive faster time-to-value from AI investments by reducing development overhead, improving governance, and accelerating AI feature delivery. Read more on InfoQ

Uber's document processing pipeline, from the incoming PDF document, through image conversion, OCR, and ML model processing, up to the GenAI response

Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation

May 1, 2025

Uber recently described a GenAI-powered invoice processing system that reduced manual effort by 2x, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations. Read more on InfoQ

An illustration of multiple agents communicating with each other over Dapr

Dapr Agents: Scalable AI Workflows with LLMs, Kubernetes & Multi-Agent Coordination

March 20, 2025

Dapr Agents is a new framework for creating scalable AI agents using Large Language Models (LLMs). With robust workflows, multi-agent coordination, and a cloud-neutral architecture, it enables enterprises to deploy thousands of resilient agents. Built on Dapr’s proven infrastructure, Dapr Agents provides reliability and observability for AI-driven applications. Read more on InfoQ

High-level architecture of Monzo Stand-in

How Monzo Bank Built a Cost-Effective, Unorthodox Backup System to Ensure Resilient Banking

February 24, 2025 | February 22, 2025

Monzo Bank recently revealed Monzo Stand-in, an independent backup system on GCP that ensures essential banking services remain operational during application and AWS infrastructure outages. Unlike traditional replicated backups, it’s a minimal stand-alone system that exclusively supports key operations and features a cost-effective design, resulting in 1% of the operational costs of the primary deployment…

A software architecture diagram of Atlassian Lithium

Inside Atlassian Lithium: How a Dynamic ETL Platform Is Transforming Data Movement and Cutting Costs

January 30, 2025

Atlassian recently introduced Lithium, an in-house ETL platform designed to meet the requirements of dynamic data movement. Lithium streamlines tasks such as cloud migrations, scheduled backups, and in-flight data validations by supporting ephemeral pipelines and tenant-level isolation while ensuring efficiency and scalability, resulting in significant cost savings. Read more on InfoQ

A simplified sequence diagram showcasing Netflix's global counter operation

Inside Netflix’s Distributed Counter: Scalable, Accurate, and Real-Time Counting at Global Scale

December 10, 2024

Netflix engineers recently published a deep dive into their Distributed Counter Abstraction, a scalable service designed to track user interactions, feature usage, and business performance metrics with low latency globally. Built atop Netflix’s TimeSeries Abstraction, the system balances performance, accuracy, and cost through configurable counting modes, resilient data aggregation, and a globally distributed architecture. Read…

A diagram showing a monolith with two GraphQL APIs being replaced by two smaller services, each exposing a single API.

Agoda’s Unconventional Client-First Transition from a GraphQL Monolith to Microservices

November 26, 2024 | November 25, 2024

Agoda recently described their unconventional approach to transitioning from a monolithic GraphQL API to a microservices architecture. Unlike traditional methods focusing on breaking down server-side components first, Agoda adopted a client-first strategy, preparing their client applications to handle both the monolith and the microservices in parallel using an in-house smart orchestrator library. Read more on…

Uber Genie's high-level architecture diagram

RAG-Powered Copilot Saves Uber 13,000 Engineering Hours

October 29, 2024 | November 25, 2024

Uber recently detailed how it built Genie, an AI-powered on-call copilot designed to improve the efficiency of on-call support engineers. Genie leverages Retrieval-Augmented Generation (RAG) to provide accurate real-time responses and significantly enhance the speed and effectiveness of incident response. Read more on InfoQ

A very high-level architecture diagram depicting the transition from a backend-based solution to WebRTC

How Canva Scaled Real-Time Collaboration with WebRTC: From WebSockets to Seamless P2P Communication

September 25, 2024 | September 24, 2024

Canva recently shared how it implemented real-time mouse pointers for collaborative whiteboarding. Canva chose a WebRTC-based solution to improve scalability, reduce latency, and lower backend load. Since WebRTC uses peer-to-peer communication, Canva can provide users with a smoother, more performant real-time experience than a traditional backend-based WebSocket and Redis solution. Read more on InfoQ

A conceptual diagram showing the main components of the Lyft IoT platform: State Management, Provisioning & Authentication, and Communication & Control.

Inside Lyft’s Glow: How IoT Architecture Is Driving Smarter Ride Experiences

August 20, 2024 | August 17, 2024

Lyft recently described how it built the Glow emblem, its newest Internet-of-Things (IoT) device. The Glow is actively rolling out in markets across the US, with over 30,000 live devices. Its architecture addresses many challenges of previous iterations through a unified IoT middleware framework, robust provisioning and authentication mechanisms, and advanced device control. Read more on…
