In conversation with Richie Artoul, co-founder of WarpStream Labs

Kshitij Grover

In this episode of Tractable, Orb's podcast with engineering leaders, Kshitij Grover dives deep on WarpStream, a cost-effective, Kafka-compatible streaming platform designed to be simpler to manage than traditional Kafka solutions. It draws on Richie's previous experience managing large-scale data systems at both Datadog and Uber. WarpStream is architected not only to save companies inter-AZ costs but also to give them a stateless system that is easier to manage, with often-acceptable tradeoffs on end-to-end latency. Richie talks about how being in the critical path of other companies' infrastructure influences WarpStream's GTM strategy and architecture: serverless pricing is an important mechanic for making WarpStream a foundational building block, and it shapes how the team thinks about reliability as a mission-critical part of the offering.

Kshitij Grover: [00:00:00] Hey everyone, welcome to another episode of The Tractable Podcast. I'm Kshitij, co-founder and CTO here at Orb. Today I have with me Richie. Richie is the co-founder at WarpStream. WarpStream is a Kafka-compatible data streaming platform, and WarpStream is doing a ton of cool stuff on top of S3. And Richie has worked at Datadog and Uber before, working on lots of big data streaming problems. So, really excited to have Richie. Richie, welcome to the show.

Richie Artoul: Thanks for having me, man. Really excited to be here.

Kshitij Grover: Awesome. Well, let's just start a little bit with your background and what inspired you to start WarpStream. As I just said, you've clearly worked in parts of this technology at previous companies.

So give me a little bit of that play-by-play and how you've thought about this problem, perhaps in previous roles in your career.

Richie Artoul: Yeah, you know, I've been working in distributed storage for most of my professional career at this point. So like nine years or something, kind of got [00:01:00] started almost all focused in the observability space, too.

So I kind of got started in storage at Uber. You know, when Uber was going through their kind of hyper-growth phase, they had a bunch of internal observability technology. I mean, they still do. And there was an internal metrics platform called M3 that I worked on. That technology, you know, it was open source and eventually became the company that is Chronosphere. And so they had a distributed open source time series metrics engine and a distributed time series aggregation thing. And so I worked on that for about three years, you know, mostly on the kind of guts and internals of the database. That was a really different system than WarpStream, too.

That was very much like a, you know, it does all the replication itself. It interacts with raw disks, classic distributed system. Interestingly enough, when we were at Uber, the observability teams actually couldn't like afford to use Kafka, like the whole company used Kafka, [00:02:00] but it just like the amount of data we were pumping through, we just, it wasn't cost effective.

And so, at least at the time, the M3 team created this thing called M3 message, which you can think of as like just the in-memory part of Kafka. You know, if the buffers fill up because there's too much back pressure, then you have to drop metrics, and that's life. So that was kind of my first encounter with streaming for these kinds of big data systems.
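
As a rough illustration of the idea Richie describes here, the sketch below shows a bounded in-memory buffer that drops new metrics once it is full instead of blocking the producer. It is not M3's actual code: the class name `DroppingBuffer`, its methods, and the choice to drop the newest metric (rather than the oldest) are all assumptions made for this example.

```python
from collections import deque


class DroppingBuffer:
    """Hypothetical bounded in-memory buffer (not M3's real implementation).

    When the consumer falls behind and the buffer is full, new metrics are
    dropped and counted rather than persisted to disk or blocking the caller.
    """

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.dropped = 0          # number of metrics discarded so far
        self._items = deque()     # FIFO queue of buffered metrics

    def publish(self, metric) -> bool:
        """Buffer a metric; return False if it was dropped because we are full."""
        if len(self._items) >= self.capacity:
            self.dropped += 1     # assumption: drop the newest, keep the backlog
            return False
        self._items.append(metric)
        return True

    def consume(self, max_items: int) -> list:
        """Remove and return up to max_items metrics in arrival order."""
        batch = []
        while self._items and len(batch) < max_items:
            batch.append(self._items.popleft())
        return batch
```

With a capacity of two, a third `publish` call made before any `consume` returns `False` and increments `dropped`. That is the tradeoff mentioned above: no durability, but also no storage cost, and a slow consumer only loses data rather than stalling producers.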

Full transcript here.

February 23, 2024
