
From Momentum to Mayhem: A Lesson in Resilient Project Management

lessons learned, project recovery | Jun 26, 2025

Authored By Phillip Henry

The project was humming along well.

I was brought in earlier in the year, sometime in April, to provide guidance and leadership, as the prior project manager had left the company. The role was a combination of Project Manager and Scrum Master. In essence, run the daily stand-up, work with the Product Owner, mitigate risks, and achieve the goal.

What was the goal? We needed to develop API functionality that would work seamlessly with one of our partners. Not only did the project affect our relationship with that partner, but missing the goal would also hurt our revenue. A lot was riding on this effort.

The project had just entered Sprint 3 with a solid backlog of features and tasks that was, honestly, well organized. The initial two sprints were a little chaotic, but that’s not unexpected, and kind of the reason that I was brought in.

To add to the complexity, however, there was a deadline. My boss told me, “Figure out how many sprints we will need to deliver the entire project and plan that timing out. Oh, but we need this done by the end of the year.”

Agile methodology, but with a deadline. Got it. Not unheard of in our field. So, I broke down the existing two sprints and came up with a formula to determine the story points completed in each sprint and how many would need to be completed per sprint to reach 'Done'. The team provided input and feedback on which areas would require extra focus and which could be given lighter attention. We broke down those areas of focus and continued with Sprint 3.
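To make that concrete, here is a rough sketch of the kind of arithmetic involved. The story-point figures, sprint counts, and projected velocity below are illustrative assumptions, not the actual numbers from the project:

```python
# Illustrative only: these story-point figures are made up, not the real project data.
from math import ceil

completed_per_sprint = [21, 26]   # points finished in Sprints 1 and 2 (hypothetical)
remaining_backlog = 240           # points still needed to reach "Done" (hypothetical)
sprints_before_deadline = 9       # sprints left before the end-of-year deadline (hypothetical)

# Average velocity observed over the first two sprints.
avg_velocity = sum(completed_per_sprint) / len(completed_per_sprint)

# Sprints required at the current pace, and the pace required to hit the deadline.
sprints_needed = ceil(remaining_backlog / avg_velocity)
required_velocity = remaining_backlog / sprints_before_deadline

print(f"Average velocity: {avg_velocity:.1f} points/sprint")
print(f"Sprints needed at that pace: {sprints_needed}")
print(f"Velocity needed to finish on time: {required_velocity:.1f} points/sprint")
```

Comparing the observed velocity against the required velocity is what made the conversation with the team about "extra focus" areas a practical one rather than a guess.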

I did not create a Gantt chart or attempt to establish Waterfall boundaries around the development and testing phases. The team was highly talented, self-organized, and on board with what we were delivering. The sprints ran like clockwork and produced good, well-tested code. I monitored our velocity and our overall burndown against our backlog. Our risks were minimal and manageable. By mid-August, we were in excellent shape; there was no doubt that we would meet our goal of delivery by December.

A Sudden Turn in the Road

And that’s when it happened.

Three days earlier, I had left the office on a Friday afternoon, very comfortable with our progress and grateful to be working with such a talented team. Emails had been circulating for a couple of weeks about routine maintenance scheduled for the Data Center over the weekend. While my team, and therefore my project, used the Data Center in a DevOps capacity, deploying code and running rounds of testing against it, I did not pay particular attention to the extent of what was happening that weekend.

On Monday morning, when I walked into the office, I immediately sensed that something wasn't right. Everyone, not just people from my project but from every project, along with folks from other lines of business, was at their keyboards or in impromptu meetings, with the look of a catastrophic error on their faces. I made my way to my desk, dropped my laptop into the docking station, and asked the couple of people in the team room what was going on. They told me that the Data Center was down.

“What do you mean?” I asked. “A server? A cluster? Part of the network?”

“No,” they replied. “The Data Center. All of it. ALL of the servers are not up.”

I still wasn’t sure that I had heard them correctly. “Are they being brought up?”

My team members looked at me again. “No. They’re not coming up. They’re dead.”

I wasn’t in front of a mirror, and therefore, couldn’t see the look on my face when that fact hit me. Maybe I blanched. Maybe my face went red. I don’t know. What I did know was that my project had just metaphorically hit an iceberg.

Through conversations that took place over the next couple of hours, we all learned that the servers in the Data Center had been scheduled for regular maintenance over the weekend, specifically to update the operating systems, apply patches, and perform other routine tasks. Nothing out of the ordinary. And perhaps because of the routine nature of the weekend maintenance, some potentially helpful precautions were skipped, such as setting restore points, backing up files, and creating safe zones.

Nothing SHOULD have happened. Unfortunately, something DID happen. To this day, I still don't know the exact nature of the problem. Perhaps it was a power fluctuation, or dormant malware that was triggered. What I did know was that everything, including other projects, my project, and day-to-day operations, was out of commission for an unknown amount of time.

Regrouping and Responding Under Pressure

My team was already in our collaboration space when this news and more details filtered in. We needed to devise a plan for continuing our work despite this setback. We did not need to come up with an immediate answer that would carry us to the end of the project, but we did need to focus on what our next steps should be. As the PM, that was my job.

All of us still had our laptops, including the development and testing teams, and the network infrastructure was not affected. For the time being, we could keep working with what we had and continue making progress toward our overall project goal: delivering the API to the client by the end of the year. Because the team had been so effective, we were ahead of schedule up to that point. Based on the estimates I had put together when I joined the project, we also knew where we stood relative to the end of the year. As a result, to our credit, we had a little bit of time to react. And we knew exactly where the point of no return was: the work that had to be developed by the end of each sprint for us to still achieve our goal.

Without the ability to perform automated testing with each sprint, we decided that manual testing on local environments would suffice, with the parallel creation of automation tests that could be utilized when the Data Center came back online. Additionally, each developer continued to create code of high quality, operating under the assumption that the code would be checked in at some point in the future.

The plan for my team was to build a backlog of development and testing work while the Data Center was offline so that we would be ready to push forward immediately when the DevOps environment came back online. Knowing that some time would be lost each sprint without that environment, we focused on mitigating the risk by maintaining our velocity, so that when we did get back online, the negative impact would hopefully be minimal.

Rebuilding Alignment Across Teams

I also collaborated with the Project Managers on other projects to create a master calendar that cataloged when each project needed access to the various environments in the Data Center. This became essential as DevOps environments came back online for Performance Testing, Integration Testing, and other purposes.
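As a hypothetical sketch of what that master calendar boiled down to, the snippet below tracks environment reservations and flags double-bookings. The environment names, project names, and dates are invented for illustration; they are not the actual ones from the project:

```python
# Hypothetical sketch of a shared master calendar: track which project has
# reserved which Data Center environment on which day, and flag conflicts.
# Environment names, project names, and dates are placeholders.
from datetime import date

bookings: dict[tuple[str, date], str] = {}

def reserve(environment: str, day: date, project: str) -> bool:
    """Reserve an environment for a project on a given day; refuse double-bookings."""
    key = (environment, day)
    if key in bookings and bookings[key] != project:
        print(f"Conflict: {environment} on {day} already reserved by {bookings[key]}")
        return False
    bookings[key] = project
    return True

reserve("performance-testing", date(2025, 10, 14), "Partner API")
reserve("performance-testing", date(2025, 10, 14), "Billing Migration")  # flagged as a conflict
reserve("integration-testing", date(2025, 10, 15), "Partner API")
```

Even a simple structure like this made the cross-project conversations concrete: if two projects needed the same environment in the same window, we saw it before it became a standoff.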

Through all of this, there also needed to be constant, consistent, and honest communication with the stakeholders. Informal communication, such as watercooler chats or emails, happened daily. On a bi-weekly basis, we demonstrated our progress and measured it against the end-of-year goal. I developed a Confidence Score to report during these demos: a quantitative, repeatable measure of how likely we were to deliver the end product. The team never sugar-coated what was happening. If something went wrong, we reported it. If we needed help, we asked for it.
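The exact Confidence Score formula isn't spelled out here, so the sketch below is only one illustrative way to fold schedule progress, open blockers, and environment availability into a single number. The weights, inputs, and function name are assumptions, not the method actually used on the project:

```python
# Illustrative only: the weights and inputs here are assumptions, not the
# actual Confidence Score formula used on the project.

def confidence_score(points_done: float, points_planned_to_date: float,
                     open_blockers: int, env_available: bool) -> float:
    """Return a 0-100 score combining schedule progress, open blockers,
    and whether the DevOps environment is available."""
    schedule_ratio = min(points_done / points_planned_to_date, 1.0)
    blocker_penalty = min(open_blockers * 0.05, 0.30)  # 5% per blocker, capped at 30%
    env_factor = 1.0 if env_available else 0.85        # offline Data Center costs 15%
    return round(100 * schedule_ratio * (1 - blocker_penalty) * env_factor, 1)

# Example: slightly behind plan, two open blockers, Data Center still offline.
print(confidence_score(points_done=180, points_planned_to_date=200,
                       open_blockers=2, env_available=False))  # roughly 69
```

Whatever the formula, the value of reporting a number every two weeks was that stakeholders could watch the trend instead of reacting to any single status update.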

After several weeks, we reported at one of our bi-weekly meetings that if the Data Center were not restored by the following week, we would be unable to meet the end-of-year delivery deadline. Many of the other projects had a similar status. When we returned to the office the following Monday, everything was back online.

At that point, the team and I worked together to prioritize what would be checked in and tested first, as we couldn’t push everything everywhere all at once. The schedule in the near term focused on testing and verification, while I worked with the other PMs to coordinate the master calendar, taking into account each project’s specific requirements. Could we potentially overlap with Performance Testing in some fashion? Did the entire development environment need to be cleaned, or could a portion remain static to save time?

At the end of the year, the team successfully delivered the API to the client. It was a high-quality product that everyone took pride in.

What Crisis Taught Me About Leadership

From this scenario, I learned to pay attention to what's happening in the environments around me and my projects, even when I think they may not have an impact. I also learned the importance of always having a time-based view of delivery for every project I manage. Whether it's a formal Gantt chart or something less formal, I always have a clear idea of what I'm delivering and when it needs to be delivered.

What would I do differently next time? While we had timeline buffers, I would develop more specific contingency plans for infrastructure failures. This includes identifying alternative development and testing approaches before a crisis hits, not during it. Additionally, the master calendar approach proved effective in crisis mode. However, I would implement this type of resource coordination from the outset of projects to prevent conflicts and optimize the use of shared resources.

What still sticks with me about this project? What impressed me most was how quickly the team pivoted from shock to finding a solution. They didn't waste time on blame or despair. They immediately started problem-solving. This reinforced my belief that psychological safety and trust within a team are the most valuable assets a PM can cultivate.

Additionally, the bi-weekly confidence scoring and transparent reporting not only managed expectations but also built trust. When we said we'd deliver, stakeholders believed us because we had been honest about our challenges. This experience shaped how I approach all stakeholder communication.

I walked in on that August Monday expecting momentum. I left that year having earned trust through crisis.


Contributor Bio

Phillip is a Senior Project Manager with a proven track record of success across multiple industries. He brings over 20 years of experience in technology, with 15 years of specific experience in project management. Phillip has a strong background in healthcare, finance, retail, and insurance. He is highly skilled at leading and managing complex projects, with a proven ability to deliver results that drive value for clients and stakeholders.

Connect on LinkedIn: Phillip Henry


Want to Share Your Story or Insight?

We welcome contributors from across the project management world. If you have a lesson, experience, or perspective that can help others grow, we’d love to feature your post on The PM Playbook.

Contribute Now

We Launch On July 9

The PM Playbook is where real leaders connect, share hard lessons, and grow together. No fluff, no corporate speak, just bold and honest leadership in action.

Join The Waitlist
