87: Michael Katz: The Evolution of packaged CDPs, democratizing ML and the myths of composable and zero data copy

What’s up folks, today I’m pumped to be joined by Michael Katz, CEO and co-founder at mParticle, the leading packaged Customer Data Platform.

Summary: In the contentious debate over Packaged and Composable CDPs, Michael delivers a clear-eyed perspective that cuts through the hype. Rejecting the idea that Packaged CDPs are becoming obsolete, he emphasizes the continued importance of data quality, integrity, and privacy, and he warns against becoming entangled in marketing illusions. He also highlights the need for adaptability, dismissing some of the more pervasive myths in the martech landscape, such as the magic of zero data copy. With strategic acquisitions, mParticle is focusing on intelligence and automation, aiming to be more than just “simple pipes” in data management. Michael’s insights provide a grounded roadmap, focusing on genuine value creation and thoughtful navigation of the complex industry that is Customer Data Platforms.

About Michael

  • Michael got his start as an analyst at Accenture and later focused on customer acquisition and marketing strategy for a mobile content company
  • He entered the entrepreneurial world founding interclick in 2005, a data-valuation platform for advertisers
  • He ran the company as President, took it public in 2009, and sold it to Yahoo in 2011 for $270M
  • He’s been on the Board of Directors for several companies including Adaptly and BrightLine
  • He’s a volunteer at Southampton Animal Shelter
  • He’s also a Mentor at Techstars
  • After the acquisition, he spent a year as VP of Optimization and Analytics at Yahoo, then started his second venture, co-founding mParticle in 2013
  • mParticle is a global, remote-first company that provides a real-time AI customer data platform. They work with players big and small, fueling the customer success of brands like PayPal, SeatGeek, Venmo, Headspace, Lyft, McDonald’s, and Airbnb.

Unpacking the 8 Components of Customer Data Platforms

When asked about Arpit Choudhury’s enumeration of the eight essential components of Customer Data Platforms (CDPs), Michael’s response was swift and assertive. With an appreciative shoutout to Arpit for articulating the complex aspects of CDPs, he aligned himself with the eight facets laid out in the question.

These eight components, according to Michael, indeed compose an end-to-end solution for the first generation of CDPs.  They include:

  1. Customer data infrastructure (CDI): collect first-party event data from customers across websites and apps
  2. ETL (data ingestion): extract data from other tools and load it into the data warehouse (DWH)
  3. Data storage/warehousing: store a copy of the data collected
  4. Identity resolution: tie together a customer’s various interactions with you across multiple platforms and devices
  5. Audience segmentation: build segments through a drag-and-drop UI
  6. Reverse ETL: extract data from the DWH and activate it in other tools
  7. Data quality: validity, accuracy, consistency, freshness, completeness…
  8. Data governance and privacy compliance: user consent, HIPAA compliance
source: https://databeats.community/p/composable-cdp-vs-packaged-cdp-components
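To make component 4 (identity resolution) a little more concrete, here is a minimal sketch of one common way to stitch identifiers together: a union-find structure that merges any identifiers seen on the same event into one profile. This is an illustration only, not mParticle’s method. The class name, method names, and identifier strings are all hypothetical.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity resolver (hypothetical, not any vendor's API).

    Identifiers observed together on one event are merged into one profile
    using a union-find (disjoint-set) structure.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(key, key)
        # Walk up to the root, shortening the path as we go.
        while self._parent[key] != key:
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def link(self, *identifiers):
        """Record that all given identifiers appeared on the same event."""
        roots = [self._find(i) for i in identifiers]
        for root in roots[1:]:
            self._parent[root] = roots[0]

    def profiles(self):
        """Return one set of identifiers per resolved customer profile."""
        groups = defaultdict(set)
        for key in list(self._parent):
            groups[self._find(key)].add(key)
        return list(groups.values())


# Made-up events: a web session, then a mobile login with the same email.
graph = IdentityGraph()
graph.link("cookie:abc", "email:ana@example.com")
graph.link("device:ios-1", "email:ana@example.com")
graph.link("cookie:xyz")  # an anonymous visitor with no other identifiers

for profile in graph.profiles():
    print(sorted(profile))
```

The shared email is what ties the cookie and the device together. Real systems layer priority rules, consent checks, and unmerge logic on top of this basic idea.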

Emphasizing the integrated nature of these components, Michael asserts that the value of the whole system is greater than the sum of the individual parts. He proudly reflects on mParticle’s reputation as a complete CDP and emphasizes that many existing CDPs lack strong stories around data quality and governance.

The conversation with Michael reveals his confidence in the synergy that arises when these parts function together. He cautions against skipping any of these steps, underscoring that a weak foundation will undermine the entire system. Speed in data processing should not compromise quality and privacy protection, and mParticle’s holistic approach ensures this balance is maintained.

Takeaway: Michael’s insights into the eight essential components of CDPs not only align with industry experts but also highlight the importance of a unified approach. By valuing integration, quality, and consumer privacy, mParticle positions itself as a leading player in the CDP landscape. The wisdom shared by Michael emphasizes that genuine value is derived not merely from the individual elements but from the careful orchestration of all parts into a coherent and resilient system.

Debunking the Myths Around Reverse ETL and Composable CDPs

Proponents of Reverse ETL and composable CDPs assert that the traditional CDP is becoming obsolete and that the future lies in Composable CDPs that leverage modern data warehouses and processes like Reverse ETL. They claim that existing CDP vendors will have to adapt to this shift or risk becoming irrelevant.

Michael has written extensively about this debate over the years. He has argued that product marketing around the composable CDP is just modern-day sleight-of-hand tricks… designed to dupe the buyer.

To be fair, mParticle has adapted to the rise of the modern data stack by offering services like data warehouse sync and value-based pricing. 

Michael highlighted the rise of the Cloud Data Warehouse as an essential system within organizations, but he was quick to emphasize that the real challenges lie in maintaining data quality, integrity, and privacy. As he elaborated, legacy CDP vendors like mParticle deliver value not in the storage of data, but in the movement and activation of it. Michael stressed the importance of going beyond mere data collection to understanding the context and the “why” behind customer behavior.

According to Michael, the true value in the CDP space has shifted towards enhancing context, improving understanding, and introducing an insights layer. For mParticle, this has translated into a focus on finding truth and meaning in their data, creating an infinitely optimizing loop. He vehemently argued against reverse ETL, characterizing it as “garbage in, garbage out,” and took aim at what he described as “sleight of hand” tricks in product marketing designed to distract from the real issues.

Michael challenged several narratives in the debate, dismissing the importance of zero data copy, the vulnerability of CDPs to security threats, and the notion of faster deployment times leading to sustained value. He warned against getting enticed by aggressive product marketing, stressing that what might appear easy to implement could be hard to maintain.

Composable claims and Michael’s responses:

  • Faster time-to-value
    Composable claim: Activating the existing data in the warehouse is faster than implementing a traditional CDP, which can often take 6 months or more.
    Michael’s response: The claim that CDPs are slow to implement is baseless. With robust and mature platforms, CDPs can assure quick deployment and high data quality.
  • Cost savings
    Composable claim: CDPs result in duplicate data storage costs and management.
    Michael’s response: “Zero data copy” is unachievable. Instead, companies should focus on minimizing data duplication and boosting data transfer efficiency. He added that the question shouldn’t be “Why am I paying for another copy of my data?” but rather “Which architecture delivers the most value at the cheapest cost?”
  • Privacy and security
    Composable claim: Off-the-shelf CDPs replicate sensitive customer data into their data store, creating additional privacy and security risks.
    Michael’s response: Security risks aren’t unique to CDPs. They depend on the software vendor’s commitment to security measures, and not on the software category.

Takeaway: The transformation of CDPs isn’t just about new technologies or marketing tactics but lies in understanding the true needs of customers. With a focus on integrity, context, and sustained value, Michael exposes the fallacies in current debates, emphasizing that real success comes from creating genuine value, not just noise.

The Realities of Replacing Traditional CDPs with Reverse ETL Tools

When asked about the growing trend where some reverse ETL customers have found ways to replace their traditional Customer Data Platforms (CDPs) with reverse ETL tools, Michael acknowledged the trend but noted that it represents only a very narrow subsegment of the market. He expressed a concern that the fragmented “Do It Yourself” approach isn’t always a practical solution, particularly for most businesses within the enterprise sector.

Michael pointed out that during the pandemic, certain habits had developed, often driven by data engineers working with limited perspectives and without a comprehensive understanding of the complexities of running successful digital marketing campaigns. This lack of integration and understanding has led to an increasing need for a return of the decision-making power to the marketers.

Highlighting the importance of usability, Michael described how mParticle is designed to make it easy for marketers to contextualize and activate data in a low code, no code manner. This approach stands in contrast to other CDPs and modern data stack tools that require intricate knowledge of SQL scripts and schema. 

A significant portion of his argument revolved around the practical challenges of troubleshooting across multiple different systems. He explained that when a business relies on eight or more different systems to serve the purpose of an end-to-end CDP, it introduces a unique set of complexities. If something goes wrong, troubleshooting becomes an intricate web of challenges involving different account managers. In Michael’s words, “the whole thing becomes a bit of a mess.”

Takeaway: Michael’s insight sheds light on the realities of replacing a traditional CDP with reverse ETL tools. The fragmented approach may work for some but presents complexities and challenges that might be impractical for the broader market. Usability, integration, and streamlined workflows are highlighted as essential elements for optimizing business value, suggesting that while there are different paths to success, a straight line is often the fastest and most efficient route. The emphasis on integration over “hobbyist” solutions presents a compelling argument for businesses looking to evolve in the ever-changing landscape of martech.

Unraveling the Myth of Zero Data Copy in Martech

When Michael was asked about the notion of zero data copy, he didn’t mince words, immediately cutting through the hype to lay bare the underlying realities. He expressed skepticism about the idea that zero data copy is a magical solution, and he questioned the assumption behind it: that copying data creates inefficiency and additional access cost.

Michael argued that the cost of storage isn’t the main driver of expenses; it’s the cost of compute. He believes that creating duplicate copies of data doesn’t drastically change costs and, moreover, that there’s considerable efficiency to be gained by replicating data for different uses and use cases.
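Michael’s storage-versus-compute argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the per-terabyte and per-hour prices, and the workload figures are placeholder assumptions, not vendor quotes or numbers from the episode. It also assumes the extra copy adds no compute of its own, which a real pipeline would not guarantee.

```python
def monthly_cost(storage_tb, copies, storage_price_per_tb,
                 compute_hours, compute_price_per_hour):
    """Return (storage_cost, compute_cost) for one month.

    All inputs are hypothetical; plug in real prices and workloads
    before drawing any conclusion.
    """
    storage_cost = storage_tb * copies * storage_price_per_tb
    compute_cost = compute_hours * compute_price_per_hour
    return storage_cost, compute_cost


# Placeholder inputs, chosen only to show the shape of the comparison.
STORAGE_TB = 10
STORAGE_PRICE_PER_TB = 25.0     # assumed price per TB-month
COMPUTE_HOURS = 2000
COMPUTE_PRICE_PER_HOUR = 3.0    # assumed price per compute hour

for copies in (1, 2):
    storage, compute = monthly_cost(
        STORAGE_TB, copies, STORAGE_PRICE_PER_TB,
        COMPUTE_HOURS, COMPUTE_PRICE_PER_HOUR,
    )
    total = storage + compute
    print(f"copies={copies}: storage=${storage:,.0f}, compute=${compute:,.0f}, "
          f"total=${total:,.0f}, storage share={storage / total:.1%}")
```

Under these made-up inputs, a second copy moves the total only slightly because compute dominates the bill. With different prices or a storage-heavy workload the picture could flip, which is why running the side-by-side numbers matters more than the slogan.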

He also emphasized the importance of focusing on the value side of the equation. Minimizing costs is essential to maximizing investable resources for growth, but it shouldn’t overshadow the primary goal of driving customer value. Michael expressed concern that focusing on zero data copy might lead businesses down the wrong path, solving for a non-existent problem.

His perspective on the issue extended to a critique of some reverse ETL companies. He noted that they often face a churn problem, luring customers in with the promise of an “easy button” only to disappoint when reality doesn’t meet expectations.

Takeaway: Michael’s dismantling of the zero data copy concept offers a vital reminder that not all that glitters is gold in the world of martech. By focusing on the practicalities of costs and the importance of efficiency and value, he encourages businesses to ask the right questions and prioritize what truly matters. His argument against zero data copy serves as a caution against getting swept up in appealing but potentially misguided solutions, emphasizing instead a thoughtful approach to data management that delivers real value.

Examining the Warehouse Native Approach to Martech

When Michael was asked about the increasing trend of warehouse native approaches in martech and its potential impact on companies with large volumes of non-revenue-generating users, his response was insightful. He broke down the question into specific elements, focusing on both the technological and practical aspects of this approach.

He acknowledged the structure of a typical marketing tech stack, with various components like analytics, customer engagement platforms, experimentation tools, and customer support services. However, he questioned the real beneficiaries of having all these tools built natively on the Cloud Data Warehouse. He emphasized that the benefit might lie more with the data warehouse provider than with the customer.

Michael also pointed out that as different vendors leverage multiple datasets and run their own compute cycles on the data warehouse, it’s not necessarily clear if that would result in cost savings. He challenged the assumption that avoiding multiple copies of data would inherently save money, stating that there hasn’t been enough side-by-side comparison to substantiate this belief.

He concluded that whether it’s through a company like Snowflake or mParticle, they are, in essence, reselling cloud compute in different forms. Simply assuming cost savings because of a lack of data duplication might not hold true in practical terms.

Takeaway: Michael’s analysis of the warehouse native approach in martech opens a nuanced conversation about the real-world implications of this trend. By examining who benefits from this strategy and challenging the common assumption that it leads to cost savings, he encourages a more critical evaluation. The discussion underscores that what might appear as an intuitive solution needs more robust evidence and careful consideration to understand its true value and impact.

📫 Never miss an episode or key takeaway 💡

By subscribing to our newsletter we’ll only send you an email when we drop a new episode, usually on Tuesday mornings ☕️ and we’ll give you a summary and key takeaways.

The Insights Layer of mParticle’s Approach to Customer Data

It’s getting harder and harder to track the packaged vs composable battle these days; there’s a ton of overlap across tools:

  • ETL tools are adding rETL features, while rETL tools and CDIs are becoming composable CDPs
  • CDPs are adding product analytics and AI features, while product analytics tools are adding CDP and AI features
  • CDPs are adding marketing automation features, while MAPs are adding CDP features
  • CDPs are also adding “warehouse connectors” or “warehouse sync”

Another interesting layer to the debate is the extension of CDP capabilities into new areas. mParticle made some interesting acquisitions over the last few years:

  • Aug 2022 Vidora, AI personalization platform for customer data
  • Jan 2022 Indicative, a customer journey analytics platform to address data entropy

With these capabilities, mParticle is adding an intelligence layer that not many CDPs have. Not only are they capturing and helping customers move data around, they’re helping them make sense of the data, look back to see what happened and also make predictions on what will happen.

Initially, mParticle’s efforts were directed at solving mobile data collection challenges, aiming to set up organizations on a durable and scalable API-based system. By addressing unique mobile data challenges that no one else was confronting, they sought to position themselves at the center of mass for many consumer brands.

According to Michael, the solution to these challenges led to mParticle’s focus on multi-channel data challenges, revolving around vital components like data quality, governance, and identity resolution. Identity resolution, Michael believes, remains one of the most misunderstood aspects of the whole process.

But the vision didn’t stop there. The evolution went beyond these challenges, aiming at what would come next: intelligence and automation. The acquisitions of Vidora and Indicative, as Michael revealed, probably accelerated mParticle’s roadmap by four or five years.

Michael brought to light mParticle’s ambitious strategy to move beyond mere segmentation tools and “simple pipes.” As Michael argued, many existing tools are like “simple pipes” that do exactly what you tell them to do. However, mParticle’s approach aims to be an intelligent force that moves the industry forward.

Michael’s discourse paints a picture of a company that’s not just satisfied with optimizing first-generation capabilities. It’s a story of looking ahead, focusing on intelligent pipes and striving to put customers in the best possible position to extract value from their first-party customer data.

Takeaway: By focusing on next-generation capabilities and accelerating their roadmap through strategic acquisitions, mParticle is positioning itself as a leading force in the evolving landscape of martech. The compelling insight is their move towards intelligent pipes that can make sense of the data, not just move it around, guiding the industry into a new era of customer data understanding and utilization.

The Vidora Acquisition: Empowering Marketers with Machine Learning

When asked about the acquisition of Vidora and its integration into mParticle’s CDP offering, Michael dove into the compelling dynamics behind this strategic move. The conversation revolved around AI tools like IBM’s Watson Studio, Amazon SageMaker, and Google’s AutoML, which are generally built for data scientists. What set Vidora apart, however, was its design to be accessible to knowledge workers and marketers, aligning with the founders’ vision to democratize machine learning.

Michael was keen to clarify that many tools in the market offer a single type of machine learning, often centered around propensity scores. But Vidora went beyond, impressing him with the building of diverse ML pipelines. The suite enabled regression testing, propensity scoring, uplift analysis, and more, without constraining the types of intelligence or automation that customers could access.
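The difference between propensity scoring and uplift analysis is easy to blur, so a tiny worked example may help. A propensity score estimates how likely a user is to convert, while uplift measures how much a campaign changes that likelihood. The sketch below computes the simplest form of uplift, the difference in conversion rate between a treated group and a control group. It is not how Vidora or Cortex computes anything; the function names and the 0/1 outcome lists are made up for illustration.

```python
def conversion_rate(outcomes):
    """Share of users who converted; `outcomes` is a list of 0/1 flags."""
    return sum(outcomes) / len(outcomes)


def uplift(treated, control):
    """Difference in conversion rate between treated and control groups."""
    return conversion_rate(treated) - conversion_rate(control)


# Hypothetical outcomes: 1 = converted, 0 = did not convert.
treated = [1, 1, 0, 1, 0, 1, 0, 1]   # users who received the campaign
control = [1, 0, 0, 1, 0, 0, 0, 1]   # users held out of the campaign

print(f"treated rate: {conversion_rate(treated):.3f}")
print(f"control rate: {conversion_rate(control):.3f}")
print(f"uplift:       {uplift(treated, control):.3f}")
```

A segment can score high on propensity and still show little uplift if those users would have converted anyway, which is one reason a single model type rarely covers every use case.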

According to Michael, the uniqueness of customer data demands tailored solutions, as no two customers’ data sets look or behave the same way. With Vidora, now branded as Cortex, mParticle offers users a full suite of models that can be applied across various channels. The seamless integration of models within mParticle allows marketers to create, sync, and activate models effortlessly, accommodating different channels from paid advertising to customer support.

But what really resonated with Michael’s view was how this acquisition tackled a common industry problem: the gap between the creation of cool models and their actual implementation into production. Most in-house models never see the light of day, and those that do are often channel-specific, failing to transcend their original context. Cortex, on the other hand, offers flexibility without channel dependency, backed by mParticle’s robust and diverse set of connectors.

Takeaway: mParticle’s acquisition of Vidora, rebranded as Cortex, has redefined the machine learning landscape for marketers. It provides a versatile and accessible set of tools that break down conventional barriers and facilitate the practical application of models across diverse channels. By doing so, it empowers marketers to extract greater value from data and paves the way for a more intelligent and integrated approach to customer engagement.

Innovating Martech Pricing: A Fresh Approach to Value-Based Pricing

When asked about the recent shift in mParticle’s pricing structure, Michael delves into the exciting philosophy behind this change. He emphatically expresses that the change isn’t merely superficial, but rather a product of innovation, something that’s more than just a re-packaging of their pricing model. 

Michael explains the need for de-averaging or de-aggregating pricing, acknowledging that the traditional charging based on users or events is fairly straightforward, but it doesn’t capture the full picture. According to Michael, not all events, users, or use cases hold equal value, and treating them as such creates a logjam through the system. This one-size-fits-all approach undermines the ability to provide marketers with appropriate solutions.

The heart of the problem is that this logjam prevents Customer Data Platforms (CDPs) from having access to all the necessary data, typically because of how they are priced. Michael highlights that when they analyzed how customers were using mParticle, they discovered three distinct use cases: real-time event federation, data maintenance for historical lookup and redundancy, and targeting and personalization.

With this fresh approach, mParticle managed to “unclog the pipes” of data, allowing it to flow where needed and at the right pace. This shift allowed for acceleration in audience calculation and refresh, and extended the look-back window on real-time audiences from a mere 90 days to perpetuity without sacrificing performance.

Takeaway: Michael’s insights into mParticle’s new pricing structure reveal an innovative and necessary departure from traditional user or event-based pricing. By recognizing the unique value in different data points and use cases, mParticle has managed to not only create a more effective pricing model but also to enhance the functionality and efficiency of their platform. It’s a lesson in understanding the complex dynamics of the martech space and the importance of aligning pricing models with actual value and functionality.

Empowering Black Founders with Technology

One of the coolest discoveries when digging through Michael’s socials is that he actually created Tech for Black Founders. He got together with a list of data vendors to provide free software to early-stage startups led by Black founders, as part of an initiative to support Black technologists and entrepreneurs, who currently make up only 1% of founders backed by venture capital in the US. 

In the midst of 2020, during a peak of social unrest, he found himself pondering how his company, mParticle, could serve the community better. It was more than a fleeting thought; it was a shower epiphany that would soon spark a wave of empowerment for Black technologists and entrepreneurs.

Michael’s initiative, which might seem simple, was profound. Recognizing that Black founders made up less than 1% of those backed by venture capital in the U.S., he set out to make a difference. The idea was to provide free software from leading tech companies to early-stage, Black-led start-ups. The aim was to bridge the equity gap, offering services usually costing six to seven figures to underrepresented founders.

He texted friends and fellow founders from Braze, Amplitude, Branch, and more. His proposal was met with instant approval, and a simple application page was launched. What happened next was nothing short of extraordinary. The initiative went viral, with 50 to 100 companies reaching out, eager to contribute, and the movement continues to grow, now encompassing hundreds of companies offering their services to Black and other minority tech founders.

Takeaway: Michael’s leadership in rallying tech companies to offer free software to Black and minority tech founders is a powerful example of how one person’s idea can ignite a movement. It underscores the importance of community and collaboration, and showcases a tangible effort to close the equity gap in the tech industry. Simple, immediate, and impactful, it’s a testament to what can be achieved when passion meets purpose.

Finding Balance and Joy in a Multifaceted Life

When asked about how he remains happy and successful amidst his diverse roles as a founder, writer, sports fanatic, dad, animal shelter volunteer, mentor, and board member, Michael’s response is a reflection of self-awareness, clarity, and wisdom. His take on balancing a life filled with various passions and responsibilities is both refreshing and deeply inspiring.

First and foremost, Michael’s priority is being a dad, a role he deems his most important job. Everything else, whether it’s being a CEO or a board member, follows in sequence. He admits that although he doesn’t always follow his own advice, the goal isn’t merely about becoming proficient at navigating the ups and downs of company building and the entrepreneurial journey. Instead, it’s about transcending these fluctuations and reaching a state of equanimity.

Michael stresses that the pursuit isn’t happiness itself; rather, the pursuit is happiness. Finding joy, meaning, and growth in whatever he’s doing is what keeps him motivated and content. He measures his alignment with his work by his excitement every Monday morning and his anxiety every Friday for not getting enough done. If those feelings begin to reverse, that’s his cue to reassess his path.

Takeaway: Michael’s philosophy on balance and happiness is a profound lesson in understanding one’s priorities and embracing the journey itself as the source of joy. His words are a reminder to find contentment in the pursuit, to align passions with purpose, and to recognize the importance of self-awareness in living a fulfilling life. His perspective turns the conventional wisdom of “work-life balance” on its head, offering a unique insight into living a life filled with meaning and happiness.

Michael Teases Exciting Announcements from mParticle

When asked if there was anything he wanted to share with the audience or any exciting things launching soon, Michael’s response was filled with enthusiasm and intrigue. He hinted at some compelling announcements coming from mParticle in September. Without divulging specific details, he provided a glimpse into what the company is focusing on.

Michael mentioned that these new developments would continue to expand on their mission of creating value. They are looking to transpose their services and add value not just in their own data store but across any data store, including the data warehouse ecosystem. Though he kept the specifics under wraps, the anticipation in his voice was clear. The audience was left eagerly awaiting the “cool shit” that mParticle has in store.

Episode Recap

The martech industry is no stranger to bold claims and sweeping predictions, and the recent debate around Reverse ETL and Composable CDPs is no exception. The air is thick with assertions that traditional CDPs are going the way of the dinosaur, set to be replaced by sleek, modern solutions. Michael, however, has a more grounded take.

For starters, he considers the buzz around Composable CDPs to be a well-executed marketing illusion, a sleight of hand rather than a genuine revolution. Sure, modern data warehouses and Reverse ETL processes are capturing attention, but at the core, the need for data quality, integrity, and privacy still reigns supreme. Michael doesn’t view this shift as a death blow to existing CDP vendors like mParticle, but rather a call to adapt, focusing on the movement and activation of data.

Adaptation is a theme that resonates throughout Michael’s insights. While acknowledging that some Reverse ETL customers are indeed replacing traditional Customer Data Platforms, he emphasizes that this trend represents a narrow slice of the market. The fragmented “Do It Yourself” approach has its limitations, especially when applied to the complex landscape of enterprise-level marketing. Here, mParticle’s approach stands out, prioritizing usability and enabling marketers to contextualize and activate data without becoming entangled in intricate coding.

Michael doesn’t shy away from debunking popular narratives in the debate, including the myth of zero data copy. Cutting through the hype, he directs attention to the real drivers of expenses and underscores the importance of focusing on customer value over cost-cutting.

Perhaps the most intriguing aspect of Michael’s perspective lies in the strategic evolution of mParticle. The company’s recent acquisitions, including Vidora, an AI personalization platform, signal a commitment to intelligence and automation. Moving beyond simple data collection and segmentation, mParticle aims to become an intelligent force that drives the industry forward. Their tools aren’t mere “simple pipes”; they’re designed to meet the unique needs of customers and provide tailored solutions that enhance understanding and value extraction.

All in all, Michael offers a refreshingly realistic and actionable perspective on the current CDP landscape. Rather than getting caught up in marketing tricks or chasing after the latest shiny object, he encourages a return to core principles and a commitment to intelligent, adaptable solutions. It’s an approach that recognizes the complexity of the industry while providing clear pathways for growth, innovation, and genuine value creation.

Whether you’re a marketer, data engineer, or business leader, listen below for insights that offer a solid foundation for navigating the ever-complex world of martech and data platforms, without falling prey to illusions or unnecessary complexity. 🎧


Intro music by Wowa via Unminus
Cover art created with Midjourney
