Talk track: Welcome back from lunch. This is the most technical block of the course, but I want to be clear — it's technical in the way a manager needs to be technical. You don't need to configure a Snowflake instance. You need to know what questions to ask, what trade-offs exist, and how to work with IT, Legal, and Engineering to make infrastructure decisions that your team can live with for the next two years. By the end of this session, you'll have two artifacts: a data infrastructure blueprint and a vendor scoring matrix. Both anchored to your case context. Let's go.
Talk track: Before we dive into specific tools and frameworks, I want to make the case for why you — as a manager, not an engineer — need to care about infrastructure. These are not reversible decisions. When you pick a data warehouse, you're committing your team's workflows, your SQL dialect, your cost structure, and your integration points for the next two to three years. Migrations are brutal. I've seen teams spend six months moving from Redshift to Snowflake — six months where they built almost nothing new. The wrong BI tool means your stakeholders can't self-serve, so every question lands on your team's plate. The wrong ingestion pipeline means data arrives late, incomplete, or wrong.
Talk track: Your analysts can only analyze what they can access. If the infrastructure is broken, their skills don't matter. The manager's job is not to configure the tools — it's to make the right choices about which tools, how much to invest, and who owns what. Getting this wrong creates technical debt that compounds. The next manager inherits your infrastructure decisions, good and bad. So take these decisions seriously.
Talk track: Here's a number that surprises most new analytics managers: you will spend thirty to forty percent of your time on cross-functional work. Not analyzing data. Not building dashboards. Managing relationships with other teams. You depend on Engineering to instrument events and maintain schemas. You depend on IT to provision access and approve vendors. You depend on Legal to clear your data usage. You depend on Finance to approve your budget. You depend on Product to tell you what to measure.
Talk track: If any one of these relationships is broken — if Engineering doesn't tell you about schema changes, if Legal takes six weeks to review a vendor — your team sits idle. Most managers handle this ad hoc: a Slack message here, a coffee chat there. That doesn't scale. This block gives you the structural tools — SLAs, blueprints, scoring matrices — to manage these dependencies like a professional. Handshake agreements don't survive re-orgs. Written SLAs do. That's the key difference between reactive management and proactive management.
Talk track: Let me preview what we're building so you know where we're headed. Four artifacts. First, cross-functional SLAs — we'll draft actual written agreements between analytics and the teams you depend on. Second, a data infrastructure blueprint — you'll map out the tools for each layer of your data stack, specific to your case context.
Talk track: Third, a build-vs-buy analysis — a decision framework you'll apply to at least one real tool choice. Fourth, a vendor scoring matrix — you'll evaluate three real vendors on weighted criteria and make a recommendation. I want to be explicit about the goal here: these are not slides you forget after class. These are documents you can bring to a job and use immediately. The infrastructure blueprint is something you could hand to a CTO in your first month. The vendor scoring matrix is something you could use the first time someone asks you to pick a BI tool. Let's get into it.
Talk track: Let's start with context. In Block A, you built a stakeholder map. Now we're going to zoom in on the cross-functional relationships that matter most for infrastructure and data decisions. Analytics sits at the center of this universe — not because we're the most important team, but because data touches everything. Product needs you to measure their features. Engineering needs you to not break their systems with heavy queries. Design needs you to surface insights in the product.
Talk track: Legal needs you to not get the company sued. Finance needs to understand what this all costs. And IT needs to make sure it's secure and compliant. Here's the mental model I want you to carry forward. Analytics is the hub. Every other team is a spoke. You don't control any of these teams — you can't tell Engineering what to prioritize or tell Legal to move faster. But you can manage the interface. You can document expectations. You can establish cadences.
Talk track: You can build enough trust that when something urgent comes up, you're not starting from zero. The managers who struggle most are the ones who treat these relationships as optional — who focus entirely on their own team and then are shocked when Engineering changes a schema without telling them. The relationships exist whether you manage them or not. The only question is whether you're proactive or reactive.
Talk track: Let me make this concrete with real failure stories I've seen. Scenario one: Engineering ships a database migration over the weekend. They rename three columns. Your dbt models reference the old names. Your pipeline fails at 2 AM. Your morning dashboards show stale data — or worse, errors. The CEO pulls up the revenue dashboard at 8 AM and asks why it's blank. You find out about the schema change from the CEO, not from Engineering. Scenario two: you spend three months evaluating and implementing a new analytics tool. You've migrated data, built dashboards, trained your team. Then Legal tells you the tool's data processing agreement doesn't meet GDPR requirements. Three months of work, unwound.
Talk track: Scenario three: your BI tool contract is up for renewal. You assumed Finance would approve it — it's the same cost as last year. But Finance is cutting vendor spend across the board. Your license lapses. Dashboards go dark. Scenario four: Product ships a major new feature. The VP of Product asks you to report on adoption. But nobody added event tracking. You have zero data. Every one of these is preventable with a written SLA and a regular check-in.
Talk track: Let's get specific about what each team needs from analytics. Product needs metrics definitions — and they need those definitions to be stable, documented, and trustworthy. Engineering needs data contracts — agreements about what data you'll produce, what schema it follows, and what the SLAs are. Legal needs a data inventory — what PII do you have, where does it live, who can access it, and when does it get deleted?
Talk track: Finance needs to know what your stack costs and whether it's worth it. IT needs to know that every vendor you onboard meets their security requirements. Design needs user behavior data and funnel analysis to inform UX decisions.
Talk track: Notice the pattern here. Everyone needs documentation and agreements from you. Not just insights and dashboards — operating agreements. Product doesn't just need you to answer questions. They need to know how long it takes you to answer questions, and what format the answer comes in. Engineering doesn't just need you to consume their data. They need to know what load your queries put on their systems, and what you'll do when something breaks. This is why we spent time on the Manager OS in Block A. The documents you produce — data contracts, metric definitions, cost reports — are the glue that holds these relationships together. Without them, every interaction is ad hoc, and ad hoc doesn't scale.
Talk track: Now flip it around. What do you need from them? Product needs to give you clear, prioritized business questions — not vague "can you look at the data?" requests. Engineering needs to tell you when they change schemas — because if they rename a field and your pipeline breaks, that's on both of you. Legal needs to be responsive on privacy reviews, not a 6-week black hole. IT needs to provision cloud access in days, not months.
Talk track: Here's the big idea: document these as bidirectional SLAs. Literally write them down. "Engineering will notify analytics 2 sprints before any schema change to tracked events. Analytics will provide a data impact assessment within 3 business days of any privacy review request." That's not bureaucracy. That's how adults collaborate. Who has been burned by a surprise schema change?
Talk track: Let me give you a practical template for building these SLAs. It's dead simple — four lines. What you'll provide and by when. What they'll provide and by when. Who to escalate to when someone misses. And how often you review the agreement. The review cadence is critical — monthly is right for most teams. It gives you a regular forum to say "this is working" or "this isn't working, let's adjust."
Talk track: The secret to SLAs that actually stick is that both sides need to see value. If the SLA only benefits you, the partner team will ignore it within a month. Start with the most critical relationship — usually Engineering — and get that SLA working well before expanding to other teams.
Talk track: Here's a concrete example for the Engineering relationship. You commit to reviewing schema change impacts within three business days, and they commit to notifying you two sprints in advance. Both sides benefit. Engineering gets a fast, predictable turnaround on impact assessments, and you get the advance notice you need to update pipelines before anything breaks.
Talk track: I recommend drafting one of these for each of your key partners — Engineering, IT, Legal, Product. It takes about thirty minutes per SLA and the return on that investment is enormous. These documents transform ad hoc relationships into reliable operating agreements.
Talk track: Here's the big picture. Every modern data stack follows this flow: you have sources of data — your product database, event streams, third-party APIs. You ingest that data into a central store. You store it — in a warehouse or lakehouse. You transform it — clean it, model it, make it trustworthy. You build a semantic layer — agreed-upon definitions of your metrics. And you visualize it — dashboards, reports, self-serve exploration. Across all of these layers, you have three cross-cutting concerns: governance, observability, and cost.
Talk track: This diagram is the anchor for the rest of this block. When we talk about specific tools, I want you to always ask: where does this fit in the flow? Print this mental model. Tattoo it on your arm. Whatever it takes.
Talk track: If you come from a business or social science background and architecture diagrams feel intimidating, this slide is for you. Each layer does one thing. Sources are where data comes from — your product database, event tracking, third-party APIs. Ingestion is the plumbing that moves data from those sources into your central storage. Storage is the warehouse — think of it as a really big, really fast database optimized for analytics queries.
Talk track: Transform is where you clean and model the raw data — turning messy event logs into clean tables your analysts can use. The semantic layer is where you define what "active user" means or how "revenue" is calculated, so everyone in the company uses the same number. Visualization is the dashboards your stakeholders actually see.
Talk track: Train yourself to ask these three questions at every layer of the stack. How do we know it's working — that's observability. What happens when it breaks — that's resilience. What does it cost — that's cost management. If you can ask those three questions intelligently about each layer, you can hold your own in any infrastructure conversation.
Talk track: If you're in the small startup case context, this is your stack. I want to be very direct: your goal is trusted numbers, fast. Not a beautiful architecture diagram. Not a resume-driven stack. You need to get data from your product database into a warehouse, transform it so it's trustworthy, and put it in a dashboard that your CEO can look at every Monday. Fivetran or Airbyte for ingestion — don't build your own connectors, that's a waste of your time. BigQuery or Snowflake for storage — both have generous free tiers.
Talk track: dbt for transformation — it's SQL, it's version-controlled, it's testable. Metabase or Preset for visualization. And governance? At this stage, governance is a naming convention document and a spreadsheet that says who owns each table. That's it. Don't let anyone tell you that you need a data catalog at this stage. You don't. Total cost for this stack: under two thousand a month. That's incredibly affordable for a startup.
Talk track: Now here's the discipline part — what to skip. You don't need a data catalog when you have fifty tables. You don't need an ML platform when you don't have ML models in production. You don't need a feature store when you have one data scientist.
Talk track: I know it's tempting to build the "right" architecture from day one. Fight that instinct. Every tool you add is a tool someone has to configure, maintain, and debug. At a small org, that someone is probably you or your one data engineer. Every hour they spend maintaining infrastructure is an hour they're not answering business questions. Add complexity when you feel real pain, not anticipated pain.
Talk track: Now you're a Series B company. You have product-market fit, you're scaling, and PMs are asking for self-serve analytics and real experimentation. The small stack isn't enough anymore. You add an event platform — Segment or RudderStack — because you need structured event collection and identity resolution across platforms. You add experimentation tooling because your growth team needs real A/B tests, not eyeballing a chart. You add a semantic layer so there's one agreed-upon definition of "active user" and "revenue."
Talk track: You add a data catalog because there are now enough tables that people can't find what they need. And critically, you add observability — Monte Carlo or Elementary — so that you know when data is broken before the CEO messages you on Slack at 8 AM asking why the dashboard looks weird. Governance gets real at this stage. You need formal data owners — someone whose name is attached to each critical table and who is accountable when it breaks. Monthly cost jumps to ten to thirty thousand.
Talk track: That's a real budget line item. But let me reframe the cost: what's the cost of a wrong number reaching the board? What's the cost of a data breach because everyone had access to PII? What's the cost of your best analyst spending two hours every morning checking whether the data is fresh? The tools pay for themselves if they prevent even one of those scenarios.
Talk track: At enterprise scale, the challenges are fundamentally different. You're not trying to get trusted numbers — you're trying to maintain trust across dozens of teams, petabytes of data, and strict regulatory requirements. Data mesh becomes relevant — decentralized ownership with centralized standards. You need an ML platform because you have multiple models in production. You need a feature store because different teams keep re-deriving the same features.
Talk track: You need model governance — model cards, bias audits, monitoring — because regulators and your own ethics demand it. And you absolutely need cost management because at a hundred thousand a month, someone is going to ask where the money goes. Monthly cost is a hundred thousand dollars or more. That sounds like a lot, but at this scale you're talking about a company with hundreds of millions or billions in revenue. The data infrastructure is a fraction of a percent of revenue.
Talk track: Governance at the large org level is a team unto itself. You need dedicated people thinking about access controls, data lineage, regulatory compliance, and audit trails. The governance stack — your catalog, your lineage tool, your access control layer — becomes its own mini-infrastructure that needs its own maintenance and its own budget. The real challenge for large enterprise analytics managers isn't the money — it's the organizational complexity. Getting approval for a new tool involves procurement, legal, security, IT, and finance. That's five teams that all need to say yes before you can buy anything.
Talk track: I want to talk about what happens when you outgrow your stack, because it will happen. Story one: you started on spreadsheets and now you need a real warehouse. This is actually the easiest migration — you're going from nothing to something. It's painful, it takes two to three months, but it's a one-way door and everyone agrees it needs to happen. Story two: you need to move from one warehouse to another. This is brutal. Every SQL query is slightly different across warehouses. Every dashboard connection needs to be reconfigured. Every integration needs to be re-tested. I've seen this take six to twelve months, and during that time your team is essentially maintaining two systems.
Talk track: Story three — and this is the most painful — you built something custom. Maybe it was a brilliant engineer who built a custom ingestion pipeline. Then that engineer left. Now nobody understands how it works, it breaks regularly, and nobody can fix it properly. You're stuck. The lesson: when you choose a tool today, ask yourself how hard it will be to leave that tool in three years. Avoid proprietary lock-in where you can. Use standard SQL. Use standard file formats. Make migration a design criterion, not an afterthought.
Talk track: Build versus buy is one of the most consequential decisions you'll make as an analytics leader. Here's a framework with four questions. Question one: is this a core differentiator? If your company's competitive advantage depends on a proprietary recommendation engine, build it. But your data warehouse is not a differentiator. Nobody ever won a market because they had a better warehouse. Question two: is this commodity infrastructure? Ingestion, storage, BI tools — these are solved problems. Buy them. Question three: do you have the team to maintain it? Open source is not free. Airflow is not free. It requires engineers to keep it running, upgrade it, fix it when it breaks. If you don't have those engineers, buy a managed service.
Talk track: Question four: what's the total cost of ownership? Add up the license, implementation, training, maintenance, and — this is the one people forget — opportunity cost. Every hour your data engineer spends maintaining Airflow is an hour they're not building the pipeline your PM needs. The rule of thumb is simple: default to buy for infrastructure, default to build for business logic. Infrastructure is commodity — someone else has already solved the hard problems. Business logic is where your team adds unique value. That's where you want your engineers spending their time.
Talk track: This is the slide I wish someone had shown me earlier in my career. Open source is not free. Let me walk through a concrete example. Airflow is the most popular open-source orchestration tool. The license costs zero dollars. But setting it up takes forty to eighty hours of a data engineer's time.
Talk track: Maintaining it — keeping it running, monitoring it, fixing it when workers crash — takes eight to sixteen hours a month. Upgrading it when a new version comes out is a multi-day project. And when it breaks at 2 AM, your engineer is the one getting paged.
Talk track: Add it up: you're spending thirty to fifty thousand dollars a year in engineer salary to run a "free" tool. A managed alternative costs four to fourteen thousand a year in subscription fees. And your engineer spends almost zero time on maintenance, so they can build pipelines instead of babysitting infrastructure. The only time self-hosting makes economic sense is when you have a dedicated platform team that's already managing Kubernetes and cloud infrastructure. For most analytics teams, especially small and medium ones, managed services are cheaper than free.
Talk track: When you're spending more than ten thousand a year or when PII is involved, you need a structured process. This is the RFP — request for proposal. Step one: define your requirements. Separate them into must-have, should-have, and nice-to-have. This is critical because vendors will try to dazzle you with features you don't need. Step two: weight the criteria. Security might be worth 25% of the score for one org and 10% for another. Step three: score each vendor on each criterion. Use a 1-to-5 scale and do it as a team so you get multiple perspectives. Step four: negotiate. I cannot stress this enough — always negotiate. SaaS pricing is almost never firm. Step five: pilot. Never commit to an annual contract without a proof of concept. Two to four weeks, real data, real users. If the vendor won't let you pilot, that tells you something.
Talk track: Here are the scoring dimensions I recommend for any vendor evaluation. Functionality — does it actually solve the problem you're trying to solve? Integration — does it connect to your existing warehouse, your existing BI tool, your existing SSO? Security — does it have SOC 2 Type II, does it encrypt data at rest and in transit, does it support role-based access controls?
Talk track: Cost — not just the sticker price, but the total cost of ownership over three years including implementation and training. Support — what happens when it breaks? Do you get a dedicated customer success manager, or do you file a ticket and wait a week? Scalability — if your data volume doubles in a year, will this tool handle it?
Talk track: I recommend scoring as a team. Have each evaluator score independently, then come together to discuss disagreements. That discussion is incredibly valuable — it reveals assumptions and priorities you didn't know existed.
Talk track: Here's where most analytics managers fail: they find a tool, fall in love with it, start a trial, build dependencies on it, and then discover that IT won't approve it because it doesn't support SSO, or it stores data outside the EU, or it hasn't completed a SOC 2 audit. Now you've wasted three months. Instead, learn to speak IT's language. SOC 2 Type II means the vendor has been independently audited for security practices — over time, not just at a point in time. Data residency means where the bits physically live — and under GDPR, EU personal data generally needs to stay in the EU. SSO means your team logs in with their company credentials, not a separate password. Audit logs mean you can prove who accessed what data and when. Learn these terms. Use them in your first conversation with IT. They will immediately take you more seriously.
Talk track: And here's the timeline reality: procurement takes two to six times longer than you expect. I've seen analytics managers plan to have a new tool deployed in a month. Then they discover they need security review, which takes three weeks. Then legal review of the contract, which takes two weeks. Then finance approval, which requires budget committee sign-off, which happens monthly. Then SSO integration, which requires IT to configure. What they thought was one month is actually four months.
Talk track: Start the conversation with IT in month one, not month six. And here's the golden rule: involve IT early. Not as a blocker — as a partner. If you bring IT into the evaluation process from the beginning, they can flag issues early, before you've invested months of work. The worst possible approach is to pick a tool, start using it, and then ask IT for permission after the fact. That's how you become the team that IT says no to.
Talk track: I want to show you what the procurement timeline actually looks like, because most analytics managers dramatically underestimate it. Month one: you identify the need and start researching vendors. Critically, you loop in IT from day one. Month one to two: you run demos and narrow down to two or three finalists. Month two to three: you run a pilot with real data. This is where you find out if the tool actually works for your use case, not just in a sales demo. Month three: security review. The vendor fills out your company's security questionnaire. IT reviews their SOC 2 report. This alone can take three to four weeks.
Talk track: Month three to four: legal reviews the contract and the data processing agreement. They'll have questions. There will be redlines. Month four: budget approval. If you've missed the budget cycle, you might wait months. Month four to five: IT configures SSO and provisions access. Month five to six: you actually deploy and train your team. Total elapsed time: three to six months. If you need a tool operational by Q3, you need to start this process in Q1. Most analytics managers start in Q2 and then are surprised when the tool isn't ready until Q4. Plan backwards from your deadline.
Talk track: Privacy is not optional and it's not someone else's problem. As an analytics manager, you are directly responsible for how your team collects, stores, and uses personal data. GDPR gives us four concepts you need to internalize. Lawful basis — you need a legal reason to process personal data. "We want to analyze it" is not automatically sufficient. Data minimization — collect only what you need for a specific purpose. I've seen teams logging every click, every scroll position, every mouse movement "just in case." That's a liability, not an asset. Right to erasure — if a user asks to be deleted, your pipeline needs to handle it. If your analytics warehouse can't delete a specific user's data, you have a compliance problem. And consent management — track what users agreed to and don't exceed that scope.
Talk track: Data classification is something every analytics manager should implement, even at a small company. Four tiers. Public data can be shared openly — your blog traffic numbers, public benchmarks. Internal data is available company-wide — aggregate dashboards, high-level metrics. Confidential data is need-to-know — user-level behavioral data, financial details, anything that could be sensitive if leaked. Restricted data is the highest tier — personally identifiable information, health data, payment data. This should have the strictest access controls and the shortest retention periods.
Talk track: Speaking of retention: don't keep data forever. Every byte you store is a byte that could be leaked, a byte that regulators can ask about, and a byte that costs money to store. Set retention policies for every data tier. Work with Legal to determine the right retention period. Then build automated deletion into your pipelines. I've seen companies get into serious trouble because they kept data for years "just in case" and then couldn't explain why they had it when a regulator asked.
Talk track: Time for hands-on work. Open the data infrastructure blueprint template. You have 25 minutes. I want you to write a one-page plan for your case context's data infrastructure over the next six months. This is a real deliverable — something you could hand to a VP of Engineering or a CTO and say "here's what we need, here's what it costs, here's the plan." Be specific about tools. Don't write "we need a warehouse" — write "we'll use BigQuery because it has a generous free tier and our team already knows SQL."
Talk track: Use the cost estimates from the slides as a starting point. Small context people: your one-pager should be simple. Three or four tools, under two thousand a month. Large context people: your challenge is scoping down to what's achievable in six months. Make sure you include a phased implementation plan and a realistic budget. The risks section is critical — think about what dependencies you have on other teams and what could delay your timeline. Go. I'll circulate.
Talk track: Second activity. Keep your one-pager open — now pick one tool category from your infrastructure plan and do a real vendor evaluation. Pick a category where you have a genuine choice to make. If you're in the small context, maybe it's warehouse or BI tool. If you're in the medium context, maybe it's experimentation or event platform. If you're in the large context, maybe it's data catalog or ML platform.
Talk track: Use real vendors, real criteria, real-ish pricing. The point of this exercise is to practice the discipline of structured evaluation. It's incredibly tempting to just pick the tool you've heard of or the one your friend uses. The scoring matrix forces you to be rigorous. Make sure you weight your criteria — security and functionality should probably carry more weight than nice-to-have features. Twenty minutes. Go.
Talk track: Let's come back together. I want to hear from a few people. First — what trade-offs did you make? Someone in the small context: what did you decide to skip, and did it feel uncomfortable? Good. That discomfort means you understand what you're giving up. Now — where did build-vs-buy get hard? Who found a case where the answer wasn't obvious? That's the real world. The framework helps, but it doesn't make the decision for you. Finally — look at your infrastructure plan through IT's eyes. What would they push back on? If you picked a tool without SOC 2, they'll push back. If you picked a tool that stores data outside the EU, they'll push back.
Talk track: Here's the key takeaway from the debrief. There is no universally correct data stack. The right answer depends on your team size, your budget, your timeline, and your regulatory environment. The framework — build-vs-buy, TCO analysis, structured vendor scoring — is the answer. It's what lets you make a defensible, well-reasoned decision in any context. The scoring matrix is your tool for having that conversation with evidence, not opinions.
Talk track: Let's take stock. You now have a practical toolkit for infrastructure decisions. You have a cross-functional map that documents your dependencies. You have the canonical blueprint that organizes any data stack. You have a build-vs-buy framework for making tool decisions.
Talk track: You have a vendor scoring process. You have a privacy and governance foundation. And you have a six-month plan specific to your case context. Both the infrastructure blueprint and the vendor scoring matrix are deliverables in your final portfolio. Refine them this week using the feedback you got today.
Talk track: Final block is coming up — the QBR simulation. This is where everything comes together. You're going to present your roadmap and narrative to a mock executive panel. Take the next ten minutes — before we start Block F — to prep. Pull up your roadmap from Day 1. Incorporate the infrastructure plan you just wrote.
Talk track: Think about the story you're telling: what are we building, why does it matter, what does it cost, and when will we see results. Anticipate the hard questions. If you recommend Snowflake, be ready for "why not BigQuery — it's cheaper." If you recommend building something in-house, be ready for "do we have the team for that?" Think like your audience. See you at 3:30.