The Engineering Leadership Podcast · Episode 31

How to Lead Large Scale Projects

with Wendy Shepperd, GVP Engineering @ New Relic

Mar 09, 2021

Wendy Shepperd, GVP Engineering @ New Relic covers how she leads massive-scale strategic programs with precision execution. You’ll hear how to lead great project kickoffs, successfully execute across multi-year time horizons, navigate complex dependencies, launch, & of course celebrate success with your team.

SPEAKER

Wendy Shepperd - GVP Engineering @ New Relic

Wendy Shepperd, GVP Engineering, leads product development for the New Relic Telemetry Data Platform, the leading SaaS multi-tenant observability platform used by tens of thousands of engineers to build and operate more perfect software for their customers. She also oversees global infrastructure, architecture, managed services, and engineering operations for New Relic.

"One of the things I tell my teams all the time, especially when things get hard... is I say, 'Hey, y'all we are making memories! One day we're going to look back on this time and talk about what we learned and laugh about it and share stories about it.'"
- Wendy Shepperd

Having worked with a variety of organizations from start-ups to billion-dollar companies, Wendy brings a unique perspective on what works well for different types of situations and at different stages of growth. She loves growing leaders and building winning teams that execute with precision. Currently leading her fourth multi-million dollar cloud migration, Wendy has developed deep expertise and a fair amount of scar tissue around the many aspects of complex platform migrations in large-scale environments.

Follow Wendy on Twitter @WendyShepperd

Check out our friends and sponsor, Jellyfish!

Jellyfish helps you align engineering work with business priorities and enables you to make better strategic decisions.

Learn more at Jellyfish.co/elc

Show Notes

Wendy's lessons learned after leading 4 multi-million dollar cloud migrations (2:29)
How to decide when to fix-forward or roll-back your complex strategic project (6:36)
What great project execution looks like in a complex cloud migration (9:37)
Wendy’s keys to a smooth cloud migration & the major steps of strategic planning (13:57)
How to lead a successful planning phase for your project (16:28)
How to resolve conflict when your teams and projects have different priorities (18:46)
How to move your project forward despite unknown dependencies (22:22)
How to hold a great project kickoff and what to avoid (26:59)
How to sustain momentum and successfully execute large, complex projects with precision (31:33)
Wendy’s team, leadership, and communication cadence during Project Cumulus (37:19)
How to handle real-time coordination and execution during pre and post-project launch phases (39:27)
Wendy’s favorite project milestone celebration (42:33)
How Wendy’s past work as a technical writer impacts her leadership now (44:54)
Wendy’s less obvious, but essential advice to lead large, complex projects (46:43)
Takeaways (48:56)

Transcript

Wendy's lessons learned after leading 4 multi-million dollar cloud migrations

Wendy, you had mentioned you're in the middle of leading your fourth multi-million dollar cloud migration. My reaction is that it sounds like... a lot of anxiety, and that stresses me out. But clearly, you know, this is something where you really enjoy the high stakes, the highly complex, the large-scale world of strategic change and precision execution.

You've also shared with us that these experiences have come with their own fair share of scar tissue. I wanted to talk about some of that scar tissue. What created it? Is there a particular experience that was more painful or didn't go as planned?

Wendy Shepperd: Definitely. As you said, cloud migrations are complicated. They're high stakes. They do cause different amounts of anxiety. When I think about some of the things that have gone wrong in previous cloud migrations, one in particular comes to mind at a past company where we were doing the migration and everything had been working fine in staging.

We'd spent several months getting ready. And when it was time for launch, we launched on the new platform and we started onboarding customers and everything seemed to be going fine. It was going fine for a couple of weeks and we were high-fiving. We were celebrating our successes.

And then in week three everything started going wrong. Our performance was starting to cause outages and data loss and data lag. And customers were getting very frustrated. We were getting paged at all hours of the night, and we ended up spending two solid weeks, I think, which were nearly 24/7 days, myself as well as my team, trying to figure out what was going on and what was going wrong, because we hadn't seen anything like this at all in staging.

And in that particular migration, we ended up having to stop and roll customers back to the previous platform and go back and spend another couple of months getting to the bottom of what was causing our performance issues, and then do a relaunch. That definitely left some scar tissue and has made me take a little bit more time before I'm ready to call a win a win. (laughing)

Patrick Gallagher: How would you describe like the mood and the emotions on the team at that realization moment on week three, where you were navigating and trying to figure out, "Oh, shoot, we actually have to roll this back..." What was the mood in the team like?

Wendy Shepperd: That was really tough. And it was a really tough decision because in any situation like that, you always have the combination of, do we fix forward? Or do we roll back? And that's of course a much easier decision in a deployment than it is in an actual migration, where you've moved customers and their data from one platform to another. So the mood was one of fear, anxiety, concern. People were exhausted. We had angry customers, we had disappointed executives.

And so the mood was, people were very stressed out. And we really had to weigh the pros and cons of moving forward versus going back to the old platform.

And so it was a very intense situation, a lot of different conversations, a lot of different opinions about what we should do. And I was working to navigate all the different conversations and the different stakeholders and customers and so forth.

Ultimately, we decided that the best thing for the customers, and you always have to think about what's really best and the least impact for the customers... was to go back to the previous platform. Get things stabilized based on what we learned and then try again.

And when we did try again, in that particular case we were successful. We got to the root of what the architectural issue was that we missed during staging and that manifested itself in production. Which I think a lot of engineering leaders can relate to, in that no matter how beefy or how good your staging environment is, it's never exactly like the production environment.

How to decide when to fix-forward or roll back your complex strategic project

Jerry Li: I can imagine that decision to roll back is a tough one. Not just because of the potential impact, but also because of all the information you need to surface and the number of people who get involved to make an informed decision. I imagine that's a multi-team or multi-organization collaboration.

Can you tell us a little bit more about how that decision was made?

Wendy Shepperd: First off, we were definitely trying to minimize the impact to the customers. And since we were already two weeks in and we had migrated a small percentage of our customers, we immediately moved those customers back. That was a pretty quick decision that didn't take a lot for us to get aligned on, because after several days of outages we realized that we just weren't going to be able to fix the situation right then.

So the decision to move back in the short term with that set of customers was fairly straightforward. Where the debate came in was what to do next. And that was a combination of me as the engineering leader, our lead architects, and also some of our key executives, finance, and other decision-makers.

And so we had a series of what I would say, architectural engineering meetings, where we laid out the pros, the cons, and really had to bring data forth explaining what the issues were. And so that's always a big challenge because sometimes you don't know exactly what the issue is and your stakeholders want you to know exactly what's wrong. When are you going to fix it? How long is it going to take?

And so part of the work, before coming back and presenting to the various stakeholders, was getting that data together to the best of our ability. And then coming back with a series of trade-offs, options, scenarios... and then ultimately letting the stakeholders choose that scenario.

This is something that happens pretty frequently in different types of projects. And what I like to do is put the decision in the hands of the decision-makers, but I like to bring multiple scenarios: "Here's the pros and cons of each scenario. And as an engineering organization, here's what we recommend."

And of course we're usually hopeful that the stakeholders will go with our recommendation, but sometimes we have to look at the different trade-offs that they bring as well, as far as finance or customers. But it really involves all aspects of the organization: the customer support leaders, the product leaders, the finance leaders, the CTO and the rest of the C-suite, as well as the engineering and architectural leadership.

Patrick Gallagher: That seems like a really empowering way to collaborate with the different stakeholders in coming at them as partners in making the decision to move forward.

I know for me, when I think about a big project, there is a lot of the fear of "what happens if this doesn't go well? And will it be this supremely painful thing that I can't handle?"

And so I think hearing your story and at least understanding how you all dealt with that... is really empowering for me in how I handle a big project like that. And so thank you for sharing that.

What great project execution looks like in a complex cloud migration

To get to the other side of it... obviously it's not all scars and bruises and things like that.

So outside of that experience, what have been some of the well-executed great projects?

Wendy Shepperd: One project I would highlight is one that we're working on right now at New Relic, it's called Project Cumulus and it's our cloud migration project. So project name: Cumulus. And we are in year one of a multi-year effort to migrate all New Relic products and services and platforms from our on-prem data centers, our private clouds to public cloud.

And one of the things that has worked well on this effort is that everybody is aligned on the NEED to do it. Sometimes you have big programs and not everybody's behind the need.

So here the need was clear. With our explosive growth and our continued scaling as an organization, it was imperative we move, particularly our data platform and other services, to the cloud. And so we quickly got aligned, and we followed some of the best practices that we'll talk about through the rest of this conversation.

But in year one for a very large data platform that we estimated would take one year, we are nine months in... 70% of our data is migrated. And we're on track to finish this quarter.

And I don't want to jinx it... 'cause we still do have a few more months of work left to do... However, this being the fourth cloud migration, I would say so far it's one of the smoothest that I've been a part of. And we're very proud, at this point in the project, to still be on track with what we committed.

Jerry Li: To give the audience a better sense of the scope and the complexity of the project, can you share a bit more about how many people are involved, and what would you say about the complexity?

Wendy Shepperd: Well, we have over a hundred engineering teams at New Relic in our product organization. And in the group I run, which is our telemetry data platform group, we have 25 engineering teams. To give you an idea of the scope that we're dealing with... we ingest over a billion data points per minute into what we call NRDB, the New Relic database platform.

And we query multiple petabytes of data and return those queries in milliseconds. And this is everything from the data ingest, to the streaming, to the storage, to the query, everything around security and availability. And EVERY product at New Relic relies on the telemetry data platform. We also have over 300 integrations with other technologies. Like, we integrate directly with Prometheus, directly with Grafana. We have all the New Relic agents that inject data into the platform. And it also runs across four data centers in multiple regions.

And so the lift to move the volume of the data AND maintain the integrity of the data while it's being moved and while it's spread across our data center and the cloud... also really increases the complexity, because we have to move everything while also keeping everything up and running with no interruptions to service.

Jerry Li: So that means most of the 25 teams that you are leading are involved in that project.

Wendy Shepperd: Yes. When I talk about the 25 teams that are in the group that I'm responsible for, there are two major focus areas. One is the data platform itself, which I just described. And also my group provides all of the tooling and enablement for all of the other product teams at New Relic. So we have a system called "Grand Central", which is our centralized deployment technology and deployment tools.

We have a whole infrastructure group called "Container Fabric", which has all of our container orchestration, our Kubernetes clusters, our Kafka clusters. Those are consumed by the data platform as well as by other New Relic products.

So it's a tale of two migrations in that we need to migrate the actual data and the platform. And then we also need to migrate all of our deployment tooling, all of our orchestration, our Kafka, our Kubernetes, and so on.

Wendy's keys to a smooth cloud migration & the major steps of strategic planning

Jerry Li: And you mentioned this is the smoothest migration so far. What are the things or processes you put in place that made it a reality? Because what I know from my past experience at Amazon and Groupon and from conversations with many other engineering leaders... cross-team collaboration, working on a large project like this... is really challenging. Not just the technology, not just the amount of work, but also getting things aligned among teams, and just collaboration is hard in general.

Wendy Shepperd: So several things. I mean, one of the key things of any successful project is communication. And so communication really weaves into every phase and you need to communicate to different audiences in different forums.

In this case, we set up those communication constructs very early on. In our case, we use Confluence as a place to store shared information. We use different tools: email, Google Docs. We have a tool called Co-On that we use for reporting status.

Some of those are geared at executive audiences. So I provide a weekly update to our executive team, including the CFO and the president and the CEO about what we're doing, which I take as a roll-up.

We have program managers that are communicating out to all the stakeholders and the consumers of the tooling that we're building. We have the internal communications with all the different engineering teams and aligning what they're building when. Architecture, artifacts, design documents.

That's all around the communication aspects. And then, we can get into this more in the conversation, but there's also really successfully going through each of the major steps of strategic planning.

So you've got kickoff, planning, defining that shared vision. You've got the execution piece, then you've got launch, you've got post-launch.

And of course they're not linear, especially with such large projects. You're typically running in and out of those phases. But one of the things that's really helped us be successful, I would say so far in year one of our migration... is following those best practices.

And it doesn't mean that we haven't had anything go wrong. We've had some things go wrong. So sometimes you get ahead on one, you know, one thing takes longer than you planned and something else takes less time than you planned.

And so overall I'd say we're kind of running at even par as far as where we've had bogeys and where we've had birdies in our migration. For those of you golfers out there.

How to lead a successful planning phase for your project

Jerry Li: Among those steps you laid out (planning, kickoff, execution, launch, post-launch), for planning, what are the pro tips that can save a lot of time down the road?

Wendy Shepperd: Sure. Number one is you have to clarify the purpose, the vision, and the scope of what you're trying to do, and you need to be clear on who you need to clarify that with. Right. So...

Jerry Li: Can you expand on that even more? That's an interesting perspective.

Wendy Shepperd: Yes. So, for example, with our Cumulus project, there was going to be an important element of finance, right? This is a change to the New Relic business model. We moved from data centers, which are a capital-intensive business model, to a cloud model, which is an operational expense business model. So from CapEx to OpEx.

So it was really important that we brought finance into the stakeholder "Cumulus leadership council" that we established, so that they would understand and be able to guide us on the different impacts of decisions that we made.

It was also very important to pull in all of the architects from all the major components across New Relic. So I mentioned there's data platform, there's tooling. Then there are all of our other products: full-stack observability and artificial intelligence operations. And really, we had to have all the general managers on board, contributing to the vision and supporting that migration and the approach to that migration, and get the buy-in there.

We also had to pull in the various engineering leaders who would be responsible for executing against the plan. And in addition to clarifying purpose, vision and scope from a narrative perspective... we also had to work together with all of these people across the different organizations on HOW we would measure success, because success can mean different things to different people.

Success means one thing to finance and then perhaps another thing to product and another thing to support. And of course, one of the key success criteria is that this is seamless to our customers, so that customers aren't impacted.

Those are some of the things and people that we had to pull together to gain that alignment.

How to resolve conflict when your teams and projects have different priorities

Jerry Li: I know in the planning phase, one of the common challenges is that the priorities across different teams are different. It's not like the whole company is working on only one project. Different teams may have different priority criteria. When there is a conflict, how do you get past that?

Wendy Shepperd: Yeah, that's a great question, Jerry and a very realistic one.

Managing dependencies is one of the most challenging aspects of any major strategic program. And in the case of our Cumulus project it's a big one, because cloud migration is an internal infrastructure project.

While it does bring us the ability to scale and do new and interesting things for our customers, it is also a different roadmap than the roadmap where we're building new features and capabilities and bringing those to market for the customers. And we have to balance those two roadmaps in terms of scaling and driving revenue and driving growth and customer engagement.

Each team has its day-to-day priorities, so one of the things that we had to do as we organized the program upfront, and as we go through each phase, is to check in. So I have close relationships and connections with all the different general managers. And not just me, but me and my leadership team.

We lay out the things that we're going to be doing and in which order. And we work to understand what dependencies we have on other groups and what dependencies they have on us, and work to align those. And of course, as we all know in everyday life, as soon as you get those things set, well, then they change too. So you have to continue to come back.

We had an example just recently, a few weeks ago, with one particular aspect of the program that we were delivering for the cloud migration. There was another part of the organization that was dependent on a piece of the functionality that we were delivering. And they were using it for something they were building into their product, whereas we were looking at it as a piece of functionality for the migration.

Well, we had to slip the date of that particular interim milestone by two weeks. And in that particular case, we were not aware that this other team was depending on that functionality for a product they were trying to deliver. And so when we published our update through these different tools that we have, so everybody has visibility, we published an update that the date was going to change to 10 days later.

Which we didn't think was a particularly big deal. Of course, we all want to hit our dates. But we were like, okay, we have to slip 10 days because we found this technical discovery. It resulted in a really significant escalation from the product leaders and, you know, a lot of furious Slack messages and then meetings to say, "Hey, wait a minute... we needed that functionality to ship this feature we're putting out, and now you're putting us off a schedule that we committed to."

So we all got together in a room, all the right architects, the right engineering leaders and product leaders. And we found a compromise. We found a way that we could put out a portion of what we needed for the migration, because they didn't need everything that was in that milestone. They needed a piece of it. So we ended up breaking up that milestone into two milestones, essentially, and got the first one out within five days, not 10. They were able to absorb the five days instead of the 10. And ultimately it worked out for everybody. But it was quite the stressful moment when the discovery came to light.

How to move your project forward despite unknown dependencies

Jerry Li: That's almost unavoidable, to have surprises along the way. One thing I want to learn more about as well is when you try to analyze the dependencies from all the different teams. It's almost impossible to get a hundred percent of the dependencies laid out from the get-go.

So there will be a point where you will need to decide when it's good enough so you can move forward. How do you make that decision?

Wendy Shepperd: Well when we lay out the plan there's a few things that you do.

For one, you need to get all the right stakeholders together. And, you know, that sounds fine. That sounds like the happy path. 'Cause honestly, stakeholders change too. Dependencies change. Priorities change. This is real life; things change on the fly.

So you can use that to your advantage as well to say, "okay, I'm planning based on what I know today. I'm laying out the plan and the dependencies, as we understand them, we have our council... or whatever your equivalent is of a council for a program you may be running... and everybody agrees at that point in time that this is the best information we have. We're going to go forward with this plan and we will change as we need to."

One of the things that can be a downfall is trying to go into more of a waterfall mentality and this, you know, perfect plan of everything's going to happen in this order. And we have to anticipate every possible thing that's going to happen... which isn't the best use of time and can hold you up too long in planning when you know that those things are going to change anyway.

So how do you know when it's right? You know it's right when your key stakeholders and the people you've identified as the decision-makers all, or maybe mostly, agree.

In fact, let me come back to that, the difference between consent and consensus. So what you want to do is get enough consent that people are either agreeing and committing, or disagreeing and committing. But everybody's had a chance to give their opinion.

And then ultimately as the leader, in this case, I'm the executive sponsor. I ultimately then have to make the call and say, okay, it's time to go. We've listened to all the feedback. We've identified everything we know today, and now we're going to kick off and we're going to move forward.

How to hold a great project kickoff and what to avoid

Jerry Li: Really appreciate the distinction between consent and consensus.

Once the planning phase is done, then you go into the kickoff. For the kickoff, what are the things that typically can go wrong, or things people need to watch out for?

Wendy Shepperd: Yeah. So things that could go wrong at the kickoff: it wouldn't be great if people come to the kickoff and they're not clear why they're there or what it is we're actually building. So if you've missed the step of clarifying the vision and communicating out the vision, and anybody who's going to be a member of that kickoff hasn't been communicated to, that would be something that could go wrong, and go really wrong.

And in fact, I'll think back to the project that I had at Rackspace called Project Polaris. At the time we were building the brand new Rackspace public cloud. And I was leading one of the engineering organizations. There were several organizations involved in getting the public cloud shipped.

Well, there was an important aspect that we didn't address early on, and that was that we needed to change our entire usage and billing model. Well, most of us in engineering were really focused on building the products and getting the products to market.

Well, this was a fundamental business shift to go from traditional dedicated hosting to public cloud hosting. And so, I had to call a kickoff meeting together to bring all the product teams together to decide what was the standard usage format that we were going to feed into the existing billing system. And this would be the new cloud consumption-based usage.

And really, for everybody that came to the kickoff, their mind was in the space of getting their products to market. And for Finance, their mind was on needing to get that consumption-based billing model, and they were sort of really surprised that we weren't as far along as they thought we were.

And so that was a case of where the kickoff was really too late. And the kickoff was also really stressful because it was something that not a lot of people were thinking about. And it was a little bit late in the game. And so you had a combination of stress, unclear expectations, and kind of a late in the phase need to do some quick rearchitecture and get some things integrated into roadmaps that were already very tight.

That was one of the most stressful times I can remember, personally AND as an organization, a big oops moment of, "Yeah, we're all ready to ship the products, but if we can't bill and invoice and track the usage, then we're not going to be able to launch."

Jerry Li: Yeah, I've seen a lot of examples where, on a large project, they don't think about, for example, accounting until it's too late. Because accounting is not an easily identified dependency, people just assume it will kind of just happen naturally (Wendy laughs), and then it's not until after the launch that it becomes a problem.

What does a good kickoff meeting look like?

Wendy Shepperd: Yeah, I love kickoff meetings. Kickoff meetings are one of the most exciting times of the project, because for the most part, everybody's like pumped. They're ready to go. Typically you've just picked your code name. Everybody knows who's going to be on the project.

You know, before we had the conversation today, I started thinking about, hey, what programs might I talk about today? The first thing I thought about was, well, I could talk about Project Atlas, Project Hercules, Project Cumulus, Project Polaris. These are just a sampling of some of the code names of programs that I've worked on.

But the other thing that makes a really great kickoff is everybody really understanding their roles and responsibilities, coming prepared with that clear vision and desired outcome, the clear success criteria, a clear plan of "Here's how we're going to execute. Here's who's doing what."

I like to use things like DACI or RACI models. Some of you may have heard of those.

If not, go Google it. DACI stands for driver, approver, contributor, informed. There are a lot of different models like this.

But what makes a great kickoff is a clear plan, a clear vision, clear roles and responsibilities. A lot of excitement. So generally if you're in person... this has changed a little bit with COVID... generally in person, you would have like some great food and some great music and things like that. So you have to simulate that a little bit more in a COVID world. But having fun and having that excitement and that energy and that vision is what makes a good kickoff really fun.
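For readers who have not met DACI before, the roles Wendy lists (driver, approver, contributor, informed) can be pictured as a small record kept per decision. The Python sketch below is only an illustration: the class name, the people, the decision, and the choice of a single driver and a single approver are assumptions, not something described in the episode.

```python
from dataclasses import dataclass, field


@dataclass
class DaciAssignment:
    """Who plays which DACI role for one decision (hypothetical structure)."""
    decision: str
    driver: str                      # assumption: a single driver per decision
    approver: str                    # assumption: a single approver per decision
    contributors: list = field(default_factory=list)
    informed: list = field(default_factory=list)

    def role_of(self, person: str) -> str:
        """Return the DACI role a person holds, or 'none' if they hold no role."""
        if person == self.driver:
            return "driver"
        if person == self.approver:
            return "approver"
        if person in self.contributors:
            return "contributor"
        if person in self.informed:
            return "informed"
        return "none"


# Hypothetical example loosely modelled on a billing-format kickoff.
kickoff = DaciAssignment(
    decision="Agree on a standard usage format for billing",
    driver="engineering lead",
    approver="executive sponsor",
    contributors=["product architects", "finance"],
    informed=["support"],
)

print(kickoff.role_of("finance"))
```

Writing the roles down this way before the kickoff makes it easy to spot a stakeholder, such as finance, who has no role at all.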

Jerry Li: It looks like people really like clarity, and people like to be informed and consulted early.

Wendy Shepperd: Yes, indeed.

How to sustain momentum and successfully execute large, complex projects with precision

Jerry Li: So as things move past the kickoff, people get excited, we have a code name. To ensure the plan is successfully translated into good execution, what are the tips you have, or success or failure examples you can share?

Wendy Shepperd: Sure. So there's all the excitement, everything's been defined, and now you're in the flow of executing. It's really important... this comes in the plan, but it really continues through execution... to break things down into manageable chunks. So for example, if we're going to move everything in New Relic to the cloud, that's a three-year effort.

So first off, when you execute, you start off with what's the first thing we're going to do. What's the first iteration? And you want to get those early wins and create momentum. So I typically advise that you start with something that's lower risk and that can be a win, that you can get feedback on quickly to create that momentum.

So think of the concept of steel threads, vertical slices of functionality. Build something that's end to end through the stack... if you're building something new. Or if you're on, say, a migration project... migrate a chunk of something. In our case here, it's a chunk of data and a chunk of customers, and you get them running in the new environment so that you can get feedback.

Also so that you can do demos. I'm a big believer in doing demos every two weeks so that you can demonstrate progress. Showing your stakeholders what you're doing helps hold your team accountable when they know they need to do a demo every two weeks. So it helps with that discipline and accountability.

Also, it gives the team that pride in their work and the ability to show off their work. And then also at the end of every one of our demos, we have a segment we call "stump the stakeholder." That gives the team a chance to ask the stakeholders questions as well, as far as things that may not be clear, or we need a trade-off in priorities, or we need to make a decision about something that changed in what we're doing. And so we want to get your feedback.

And then another important part of designing, architecting, iterating, doing steel threads and demos is tracking those dependencies we talked about earlier. Who's dependent on us? Who are we dependent on coming up in the next iteration? And making sure that we're ready for that.

And then another thing that I think is really important to keep the momentum through these large projects is to celebrate and have fun with it. Don't wait for the very end. I love, you know, just some of the creativity I see in our demos where people are working in animations or, you know, we have the common wars between the cat people and the dog people. And some people have the dogs in their demos and some have the cats, and it just creates that camaraderie. You know, it's really hard work to do these big programs. So it's important to do these kinds of activities to keep the communication, keep alignment, but also keep momentum and some levity in the work.

Jerry Li: I think the mindset of caring about people and how they feel, and also the energy and camaraderie, is an important notion.

And anything else?

Wendy Shepperd: Yes, definitely. Well, testing... So I'll say testing three times... making sure that you've got different environments. Like at New Relic we have a synthetics product, so we heavily leveraged synthetics to simulate transactions, simulate load. We have load test environments, performance test environments, integration test environments.

We are leveraging a cellular architecture for our cloud migration so that we can do that segmentation of data and customers and environments: R&D, staging, and production. And so it's so critical, particularly with these large-scale change projects, to test that behavior, to do A/B testing, to do dark launches.

Another important part is getting the customer feedback. And in our case, it's both internal customers that are using the tooling we build as well as external customers who are using that data.

Other technical must-haves are writing down designs, writing down architecture, holding architectural reviews. For example, we had a case where we weren't aligned architecturally on a critical piece of functionality during a project that I was working on in a previous company. This was a lack of clarity on what we were doing, and because the architecture wasn't written down and agreed to by multiple groups, one group built something and they thought that it was done... and I'll just tell you, it was a data-to-data transfer between two different environments.

And the group that was building the data-to-data transfer thought the reason they were building it was for disaster recovery. So if something went wrong, they would be able to move the data in that type of scenario. But the group that was going to be consuming that functionality was expecting the data could be moved on the fly in live environments so that they could move things around within the infrastructure.

Jerry Li: That's entirely different.

Wendy Shepperd: Real time, as they needed it. But then it just had this name, right? It was just, like, "data transfer." And so that was a case where we really learned the hard way that by using a simple pet name for a complex piece of functionality, two different engineering teams thought two different things were happening.

And so if we had had more architectural reviews and really defined the vision of what that functionality was more clearly, we could have avoided that mistake.

Wendy's team, leadership, and communication cadences during Project Cumulus

Jerry Li: How do you control the cadence?

Wendy Shepperd: So I think of cadence in different areas. So there's the cadence of communications. On our current Cumulus project, we do weekly internal blog post updates. We do an every-other-week sync with our Cumulus Governance Board, which is with our CEO and our CFO and our other key finance and business leaders, to talk more about the macro aspects of the project, the financials of it. Maybe the different contracts that we're doing with vendors. So more of the business aspects.

Then we have the team cadences. So because there's so many teams working on it, each team has their own individual process of doing their kind of sprint ceremonies and their sprint kickoffs and their sprint demos.

So that's happening. And then we also have a weekly Cumulus leadership sync meeting, and these are the execution leaders of the different engineering teams. And there's a group of us that represent architecture, product, program management, and engineering. We sync weekly to talk about different issues and talk about what's upcoming. It's a variable agenda every week.

And then we also have in my group every other week, we do an all hands meeting with the Telemetry Data Platform group, which is about 25 engineering teams. And there, we also do a Cumulus update and talk about what's going on.

And so the cadence is... depending on the type of communication, it might be weekly, it might be every other week, it might be monthly. But it's so important to think about different audiences, different types of communications, and different channels for communications, because some people prefer Slack, some prefer email, some prefer attending an all hands or watching the recording later.

Especially since we're a global company, like many, we record a lot of communications too, and that way people can watch the recording on their own time.

How to handle real-time coordination and execution during pre and post project launch phases

Jerry Li: And now the next phase is launch. For a large project like this, the launch coordination especially... a lot of things are going to be handled on the fly in real time.

Do you have anything to share on that?

Wendy Shepperd: When it's time to launch, it's really important to think of launch in two phases. Now, remember we are iterating all along the way, but when I think about launch, I think about a prelaunch and then THE launch.

So a pretty common practice in the industry... and we're no exception, nor are the other programs that I've led... is that prelaunch typically includes early access customers or beta customers. And that's letting people in who have an understanding that you're not completely done and you want their help. You want their feedback. And they're going to be supportive and understanding.

And so having a prelaunch is super critical to ensuring success of the actual launch. And then when it is time to launch, I think another thing that I've seen go well and sometimes not so well is: what does it mean to be "done done"? So sometimes engineering may think, "Hey, the code is done. We're ready to launch," and not leave enough time in the schedule for all the go-to-market activities... so the documentation, the training, the launch announcements, the blog posts, sales enablement, depending on the type of project. And so it's really important that you have a launch phase that is built into your program.

And then when it really is time to launch, as I mentioned earlier, just because you launch one day doesn't mean now you clap and celebrate and everybody goes on vacation. Like, you really have to monitor that post-launch and keep eyes and ears to the ground. Be watching, using those observability tools like New Relic: watching the usage, watching the uptake, watching the performance. And really keeping an eye that things are going well.

And I think then, once you see successful users and you have the key indicators that show you're meeting your success criteria... at that point, that can be when it's time to celebrate.

And another important part for me is post-launch. So besides the recognition, rewards, sharing the stories, giving out the swag, right?! Giving out the t-shirts with the code name on it and everything... it's really important to run a retrospective within a couple of weeks of the launch.

You want to give people a break, because typically they've been pushing really hard. But before they forget everything, you want to have a retrospective about what went well, what didn't go well, what did we learn. And then typically you always have a set of fast follows. These are the trade-offs you made, typically in order to make your date, where you said those will be fast follows coming right after launch.

And it can be difficult sometimes to keep that momentum post-launch, when everybody's just ready to move on. But it's like, "Nope, wait, remember, we need to now plan to do those fast follows," or then also address things that are coming up in production that you didn't find before you launched.

Wendy's favorite project milestone celebration

Patrick Gallagher: Wendy, I had a quick follow-up question from some of the earlier things that we were talking about, because you mentioned celebration as such a powerful force for breaking down these huge projects.

And so I was curious to know what was maybe your favorite mid-project celebration that helped break down one of those multi-year executions?

And I guess the moment that you're thinking back and remembering of like, this is amazing, I feel really good about this. Do you have a specific moment that comes to mind with that?

Wendy Shepperd: You know, I do. I think of one back in the Rackspace days. You know, Rackspace was such an interesting place and an interesting venue. There's this place called the Castle in San Antonio, which is an old converted shopping mall that was our office space. And so there were a lot of really fun things we could do.

We had a big, giant slide that you could jump on going from the top floor down to the bottom floor. And then we could get access down to the San Antonio Riverwalk. And so we planned an outing that was in the middle of one of the most stressful periods of building that new cloud. And we did a full day outing that included a scavenger hunt, and we broke people up into teams and there were things that were just spread out throughout the Castle and throughout the Riverwalk where people had to go and find these things.

And because it was so intense and so interactive, I think people really did truly forget about the project for the day because they were trying to like go figure out what was the next clue and the next place they had to go find something. And we also deliberately mixed up the teams so that it was people that were all on the project, but they weren't all on the same immediate team.

And then afterward we had a big celebration with food and drink, and then we released people early for the day. And we did that on a Friday too, so then they were released into the weekend. It was just really, you know, a good shot in the arm that we needed in the middle of the project, and a way to remind ourselves, "Hey, we're people, we're working on this together, and we're having fun. And this is something that we're going to remember for the rest of our lives."

Yes.

Patrick Gallagher: What a great moment of decompression. Oh my gosh. I feel relaxed just imagining myself on the Riverwalk, going through the scavenger hunt. Jerry, are you taking notes? It sounds like these are things we need to recreate. I had one more question, Wendy, and then I think Jerry's gonna wrap us up.

How Wendy's past work as a technical writer impacts her leadership now

Throughout discussing this whole topic, it is so clear that you are so intentional and focused really on clear communication. And it seems like that is something so important for you throughout this whole process is to be really intentional with how you bring people together and how you communicate your message.

I noticed that in your earlier career, you had a lot of experience within the world of technical content and technical communication. And I was just curious if there was anything about that early experience that shapes how you communicate now as a senior engineering leader.

Wendy Shepperd: Yeah, that's a great question... I can tell you must have looked at my background.

So indeed I was a technical writer and content developer early in my career. And it is really incredible how much I use those skills in management and leadership and in running strategic programs, because one of the things you learn in technical writing and content development is how to be concise, how to be data-driven, and how to write things in a consumable way in different formats.

And so I think something that comes more naturally to me is being able to think about content management systems and different audiences and writing those communications. It helps whether I'm writing performance reviews or writing an executive comm or sending a message in a Slack channel. I think those are skills that I use on a daily basis. And I think sometimes it's underestimated.

We often say "technology is not the hardest problem here..." It's really about communications and building relationships. That is really the investment that you make that helps technology programs be the most successful.

Patrick Gallagher: That's an incredible quote. Technology is not the problem here. Jerry, do you want to bring us home and wrap us up?

Wendy's less obvious, but essential advice for leading large, complex projects

Jerry Li: If there's only one piece of advice you want to give to people who are working on large, complex projects, something that is less obvious but really important, what would that be?

Wendy Shepperd: I think one of the most important things is... when things get tough and you feel stressed out and you feel worried about your dates and your budget, and you're getting grilled by your stakeholders is to continue to stay bold and stay confident and trust your instincts.

And also assume positive intent. People are not out to get you, even though it can feel like it. And I, myself... and it's not something I've mastered. I think it's something that I've gotten much better at... but I think when you get stressed and you feel challenged, you tend to get defensive. And so when you get defensive, that's not the most effective way to work through the challenges. It's most important to keep calm, assume positive intent. People have questions. They can't read your mind. And then use data and facts.

But most important of all, I think, is: be bold and be confident that even if you don't know what obstacles or challenges you're facing or how to solve the one that's right in front of you right now... know that you are going to work through it. And you're going to learn something from it.

One of the things I tell my teams all the time, especially when things get hard, is I say "Hey, y'all we are making memories! One day we're going to look back on this time and talk about what we learned and laugh about it and share stories about it."

Jerry Li: That really resonates. I immediately think about the challenges we were going through last year, when we had to move our annual conference... where we spent a lot of effort... to a later date, and it eventually became a virtual event. That felt really scary and stressful in the moment. But now looking back, those are fond memories.

Wendy Shepperd: Yes!

Jerry Li: So thank you for sharing all those learnings. It's really important.

Wendy Shepperd: Thank you. This was a great conversation. And I was able to relive a number of those memories during this conversation and think about how challenging that was and how I've pulled some of those lessons forward into things I do today.
