Author: adaptiman

CrowdStrike Outage “Not What You Thought”
It’s been six months since the CrowdStrike outage – enough time to reflect on the incident and take stock. I had lunch with my CISO about a week after the outage. It was the first time we had seen each other in several weeks. “So,” I asked sheepishly, “how have you been since the outage?” “I’ve been fine. But the Service Desk has been swamped. Since my security team wasn’t that busy, we pitched in to help remediate the outage. They touched 15,000 servers and client machines in three days.” I inquired further. His role focused on the management of encryption keys that were necessary to unlock and manually patch the operating systems of the affected machines. “The hard part of the recovery was managing the keys,” he said. As his team was jointly responsible for the security of those keys, that was the extent of his involvement. You see, CrowdStrike pushed a bad patch – one file – but an important one that loads at the kernel level. This caused all of those Windows machines to “blue screen.”

Something didn’t compute. I thought he was going to be falling asleep at the table, eyes bloodshot, bags under them, a quart jug of coffee in his hand. Instead, he seemed rather chipper. Then it hit me. This wasn’t a security incident. Rather, it’s what we call in ITSM a deployment and release management issue. It’s not that Security Management wasn’t involved, they were. But it was apparent early in the Problem cycle that this wasn’t a cyberattack.

The response from our university IT was quick and appropriate. Within thirty seconds of the patches being applied, customers began to call and report “blue screens.” This spawned a number of related incidents at the Service Desk. These incidents were quickly correlated into a Problem record, which was upgraded to a major incident (i.e., outage) record in less than an hour, all of this happening around midnight on July 19th. During the early morning hours, an incident response team did a root cause analysis and quickly determined the problem was a vendor patch.
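For readers who think better in code, here is a rough sketch of how related incidents can roll up into a Problem record that then gets flagged as a major incident. This is only an illustration: the `Incident` and `Problem` classes, the `correlate` function, and the threshold of three are all hypothetical and are not how our Service Desk tooling actually works.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    incident_id: str
    symptom: str  # what the customer reported, e.g. "blue screen"
    asset: str    # the affected machine


@dataclass
class Problem:
    symptom: str
    incident_ids: List[str] = field(default_factory=list)
    major_incident: bool = False


def correlate(incidents, major_threshold=3):
    """Group incidents by normalized symptom into Problem records.

    A Problem is flagged as a major incident once it collects at least
    `major_threshold` incidents (an assumed, illustrative rule).
    """
    grouped = defaultdict(list)
    for inc in incidents:
        grouped[inc.symptom.strip().lower()].append(inc.incident_id)
    return [
        Problem(symptom, ids, major_incident=len(ids) >= major_threshold)
        for symptom, ids in grouped.items()
    ]


# Hypothetical calls arriving at the Service Desk
incidents = [
    Incident("INC-1", "Blue screen", "laptop-014"),
    Incident("INC-2", "blue screen ", "server-201"),
    Incident("INC-3", "blue screen", "kiosk-007"),
    Incident("INC-4", "Printer offline", "printer-3"),
]
problems = correlate(incidents)
for p in problems:
    label = "MAJOR INCIDENT" if p.major_incident else "problem"
    print(f"{p.symptom}: {len(p.incident_ids)} incident(s) -> {label}")
```

In real life the correlation is done by people and tooling looking at far more than a symptom string, but the shape is the same: many incidents, one Problem, and an escalation once the pattern is clear.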

The vendor response was quick and the patch was available by early morning, although the CEO of CrowdStrike was criticized in subsequent days for not issuing a timely apology. The damage to CrowdStrike’s reputation was done. After all, the outage affected roughly 8.5 million computers. CrowdStrike was quickly seen as the responsible party and IT folks around the world became heroes as the outage response progressed. But Microsoft was also responsible for letting CrowdStrike play in the Windows kernel. Microsoft distanced itself from responsibility by asserting, “Although this was not a Microsoft incident, given it impacts our ecosystem, we want to provide an update on the steps we’ve taken with CrowdStrike and others to remediate and support our customers.” Here, Microsoft was acting as an integrator, more specifically, as a Service Guardian, in that it both managed a third-party vendor (CrowdStrike) and provided services (Windows). In this situation, ITIL best practices dictate that we have a high level of communication and trust with the integrator, but also acknowledge that our customers will hold us – not our vendors – responsible. After all, who are our customers going to blame – us or our vendor?

I see a double failure here. CrowdStrike failed by deploying a service with a critical bug in it, which they should’ve uncovered in their acceptance testing. This is not CrowdStrike CEO George Kurtz’s first high-visibility failure. In 2010, he was CTO of McAfee when a similar outage occurred. The second failure was Microsoft’s mismanagement of their vendor. One may ask why they allowed a vendor to deploy a file at the kernel level without sufficient testing. You would also expect Microsoft to have caught the error prior to approving the release of the errant file. Was Microsoft’s trust of CrowdStrike so great that they didn’t do acceptance testing and simply passed the updates through? If so, they need to review their Deployment and Release Management practices. Of course, this is pure speculation.

Meanwhile, back at “the ranch,” the incident response team (IRT) created a Change Request that included testing of the patch on a number of machines. Procedures to apply the patch were documented at both the individual asset level and the more strategic coordination level. On the communication side, customer communication began as soon as the Problem was identified, about an hour into the incident, with a number of communications happening in the early morning hours via IT staff in the colleges and university communications to stakeholders. Communication continued through the next few days as the incidents were remediated and non-reported servers and endpoints patched. An After Action Review was conducted less than a week after the initial incident was reported. Lessons learned were documented. DONE!! YAY!!!

Since I retired from IT, I’m an “observer” these days and I can tell you that I don’t miss the excitement surrounding outages. Been there, done that, got the t-shirt. But I must say that I’m very proud of the way our university handled this major incident – responsive, professional, by the book. I don’t think our response would’ve been as good five years ago. We’ve come a long way in our journey in understanding ITSM.

In summary, what ITSM practice areas were involved in this outage?
1. Service Desk
2. Incident Management
3. Problem Management
4. Continuity Management (via Major Incident/Outage)
5. Vendor Management
6. Asset Management
7. Relationship Management (i.e., communication with stakeholders)
8. Change Management
9. Security Management (indirectly)
This is a pretty impressive slice of the ITIL ITSM Practices for a single issue. I think our IT folks would report that we have varying levels of maturity in each of the Practice areas, but I can tell you from experience that this kind of outage hones our skills to respond better the next time. Iron sharpens iron.
December 20, 2024
The Provenance of a Dram

Nine and a half today – ugh.
Stiff legs,
Cold hexagonal tile,
Glow belt,
Running shoes – a little worn,
Earbuds,
Running watch,
Smartphone.

Queue ’em up:
Sing the Hours,
Daily Wire,
First Up,
Daily Poem,
Megyn Kelly,
Dan Bongino.
Long enough? Yes.
Push play.

Brisk, still, fall air,
Dogs’ wet noses,
Tail of death.
Start slow.
Breathe in through the nose
And out through the mouth.
Up the pace.

“Welcome to the Daily Poem…
Quinquireme…
Cedar, cinnamon, and sandalwood…”

Running rhythm becomes poetic pulse.
Time ceases to hurt my lungs.

“Emeralds and amethysts…
Gold moidores and sweet white wine.”

Sweet white wine? – Blech!
David, Sean, and Bethany need
A midwinter taste of Texas with a stamp!

Sharp fall radishes,
Brown eggs laid down sparingly,
Gloriously dead ragweed stalks,
Whispering post oak snow,
Family gatherings.

Love.

Friends for dinner,
Pork loin,
Mashed sweet potatoes with butter,
Brussels sprouts with bacon,
Strangely empty rocks glasses.
Brisk fall sales,
Sharing the Bread,
Growing the podcast audience.

Try a drop of this –
Kooper’s cooperage,
Oaky honey,
Crème Brûlée,
Apple, cinnamon, lingering citrus –
Oh My!

December 20, 2024

Bedtime Acclamation

"I'm getting married on my head,"

My sweet John Robert to me said.

"Perhaps my Dear

Should wait a year.

But now it's time to go to bed."

November 10, 2024

Lost Improvements: An Analogy to Defects

Defects are not free. Somebody makes them, and gets paid for making them.
W. Edwards Deming

To summarize Deming’s teaching on defects, they cost an organization thrice. First, the defect is made, which robs the organization of a “working” product or service. Second, the defect must be identified, which also takes time and resources. Lastly, the defect must be resolved, thus taking more resources away from producing non-defective products and services. As if this weren’t bad enough, these costs don’t include opportunity costs which could be mitigated with improvements.

In manufacturing (and IT ;-)), a defect happens because of a quality failure either at the source or somewhere upstream. Once a defect is built into a product, there are two ways to detect it. First, it may be detected prior to shipping. Second, the customer may see the defect, which is significantly worse from a CX perspective. To draw the analogy to lost improvements, if there is no system in place to record improvements, that’s the equivalent of allowing a defect to get to the customer. Lack of improvement causes more technical debt and operational overhead down the line and will be reflected in much of the work that is done by the organization. These defects will be visible to customers, one way or another. How does an organization create a culture of continual improvement?

First, an organization must embrace a culture of improvement. According to ITIL 4, a culture of improvement requires three things: transparency, managing by example, and building trust (CDS, 2.3.4, 2.3.8). I’ll treat these three topics in more detail in a future post, but suffice it to say that my perspective is that the first two depend on the third – that is, trust is the “coin of the realm” and other aspects of an improvement culture are dependent on it. For example, organizations that have a high degree of trust manifest a correspondingly high level of transparency.

Second, an organization must provide mechanisms for conserving, prioritizing, and executing improvement initiatives. Starting with a Continual Improvement Register (CIR) is a good first step. If systems are too prescriptive, or improvement processes are not defined, team members don’t feel empowered (or able) to record improvement ideas. Without improvement, the organization will continue to produce defects. Making the CIR accessible at all levels of the organization is also recommended. Appointing a dedicated improvement person or small team responsible for prioritizing and executing on those improvement opportunities closes the loop. Communicating the status of improvement opportunities creates buy-in from the organization and keeps the suggestions rolling in. In my experience, organizations go awry in the second requirement. They may build a culture of trust and improvement, but that culture must be operationalized to realize the true benefits.
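To make “operationalized” a little more concrete, here is a minimal sketch of a CIR that anyone can record ideas in, that ranks the open ideas, and that reports status back to the organization. Everything in it is an assumption for illustration: the class and field names, the 1-to-5 `benefit` and `effort` scores, and the benefit-divided-by-effort ranking are mine, not something ITIL prescribes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Improvement:
    title: str
    submitted_by: str
    benefit: int  # hypothetical 1-5 score, higher means more value
    effort: int   # hypothetical 1-5 score (must be at least 1), higher means more work
    status: str = "new"

    @property
    def priority(self) -> float:
        # Illustrative ranking rule: favor high benefit and low effort
        return self.benefit / self.effort


@dataclass
class ContinualImprovementRegister:
    items: List[Improvement] = field(default_factory=list)

    def record(self, item: Improvement) -> Improvement:
        """Capture an idea so it is not lost."""
        self.items.append(item)
        return item

    def backlog(self) -> List[Improvement]:
        """Open ideas, highest priority first."""
        open_items = [i for i in self.items if i.status != "done"]
        return sorted(open_items, key=lambda i: i.priority, reverse=True)

    def status_report(self) -> List[str]:
        """One line per idea, suitable for sharing with the organization."""
        return [f"{i.title}: {i.status}" for i in self.items]


# Hypothetical entries
cir = ContinualImprovementRegister()
cir.record(Improvement("Automate password resets", "service desk", benefit=4, effort=2))
cir.record(Improvement("Rewrite onboarding guide", "HR liaison", benefit=3, effort=3))

top = cir.backlog()[0]       # the improvement team picks the top item
top.status = "in progress"
print(cir.status_report())
```

A shared spreadsheet does the same job. What matters is that all three steps exist: ideas are recorded, someone prioritizes them, and the status is communicated back.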

August 5, 2024
The Non-Technical Economy
It seems that everything these days is about AI and how the world as we know it will end. Some are prophesying that entire swaths of our economy will be replaced by AI. Writers are lining up on both sides of the argument. For those of us who are IT professionals, much has been written (or assumed) about the use of AI in IT.

In 2019, Brian Merchant wrote, “A robot is not ‘coming for’, or ‘stealing’ or ‘killing’ or ‘threatening’ to take away your job. Management is.” Is that true?

While most people are focusing on those things that AI can do to replace human work, there are a significant number of things that AI can’t do better than humans. Interestingly, I asked AI (i.e., Microsoft Copilot) for a list of things that humans do better. Here’s the list:
1. Emotional Intelligence: Humans are naturally equipped with the ability to understand, interpret, and respond to emotions in a way that AI cannot. This includes empathy, compassion, and the ability to read subtle social cues, which are essential for meaningful human interactions.
2. Creativity and Innovation: While AI can generate new content based on existing data, humans have the innate ability to think outside the box, imagine completely new concepts, and create innovative solutions to problems.
3. Ethical and Moral Judgment: Humans can navigate complex ethical dilemmas and make decisions based on a nuanced understanding of morality, societal norms, and personal values. AI lacks the consciousness to make such judgments.
4. Adaptability and Flexibility: Humans can easily adapt to new situations, learn from diverse experiences, and apply knowledge in various contexts. AI systems, on the other hand, can struggle with tasks that deviate from their training data or require adaptation to new environments.
5. Complex Problem-Solving: Humans are capable of solving poorly defined or novel problems that require intuition, judgment, and the ability to find connections between seemingly unrelated information. AI systems are typically limited to solving algorithmic problems and can struggle with tasks that cannot be systematically defined.
6. Physical Dexterity and Mobility: Humans have fine motor skills and the ability to navigate complex physical environments, which are challenging for AI and robotics to replicate with the same level of proficiency.
7. Interpersonal Skills: The ability to build relationships, collaborate effectively, and navigate social networks is a distinctly human trait. AI can assist with communication but cannot fully replicate the depth of human interaction.
8. Learning from Limited Data: Humans can learn new skills or understand new concepts from a few examples, whereas AI often requires large datasets to learn effectively.
9. Understanding Context and Nuance: Humans excel at understanding context, sarcasm, irony, and nuanced language, which can be challenging for AI to interpret correctly.
It’s important to recognize that AI is a tool designed to augment human abilities, not replace them. The collaboration between human intelligence and AI has the potential to enhance productivity and innovation across various fields.

What’s interesting about this list is that most of these skills are closely related to those needed to provide excellent IT service management. As the emphasis in IT has grown over the last three decades from technical to customer-service competencies, the identification of these soft skills has been one of the ways the profession has defined and delineated itself. Take, for example, the list of skills necessary to provide excellent service desk support (ITIL4 Foundation Training, 2024):
- Customer service
- Empathy
- Incident analysis and prioritization
- Effective communication
- Emotional Intelligence
It would appear, at least at this moment in time, that AI will not be able to do some of the fundamental things we do in IT service management. Indeed, a survey of those industries most susceptible to “takeover” by AI includes manufacturing, finance, healthcare, cybersecurity, and education. Note that these fields don’t rely heavily on stakeholder interactions to be effective.

So why are “managers” still trying to replace us? I think the answer is that they are thinking in a binary way – either we use AI to do work or we use humans. The real answer is that AI will augment and complement humans in IT service management, not replace them. The collaboration between human intelligence and AI has the potential to enhance productivity and innovation across various fields. This is reflected in the newest ITIL4 Create, Deliver, Support curriculum which stresses the effective integration of AI, among other tools. Mature IT Managers will realize that AI is a tool that can automate steps of the value stream, but at the end of the day, customers will have better outcomes and realize more value if humans are left to do what humans do best.
July 30, 2024
The Appropriation Paradox

My father joined the army fresh out of high school. After training as a “sigint” operator, his second tour took him to Munich, Germany. He married my mother shortly before moving to Germany and they settled in the small town of Bad Aibling, close to the army base and nestled under the gaze of the Wendelstein in the Bavarian Alps. I was born 11 months later in a Munich army hospital.

Although I don’t remember much from that first year of life, I do remember family references to my birthplace and still have a pair of lederhosen they bought me there. I don’t have a drop of German blood in me, but have always loved German things. In high school, I took German for four years and was a proud member of our high school German club. I learned all about German culture including German folk dancing which carried over to college. That’s where I met my wife. My wedding present from her was a Lladro statuette of polka dancers.

What’s my point? The reason I’ve embraced German culture is that I love it, respect it, study it, share it, and advocate for it. So yesterday’s story in the UK Daily Mail about a University of Houston Latina sorority that “culturally appropriated” black step dancing doesn’t make any sense. No one adopts a cultural practice that they don’t like. On the contrary, adopting a cultural practice is a way to show your love and respect for it by making it your own. I don’t see anything wrong with that.

The main argument against cultural appropriation is that if you are not a [fill in the blank: race/gender/culture/religion] person, then you can’t do something that a [fill in the blank: race/gender/culture/religion] person does. Let me get this straight: if I’m black, I can step. If I’m white, stepping is cultural appropriation. So the difference in these two scenarios is my color. Hmm.

June 19, 2024
ITIL 4 and Aggregation Theory

Back in the days of ITIL v3, focusing on process was the right thing to do. Building out robust, documented, repeatable processes went a long way toward consistent service delivery, and for many years, this approach to service management was enough. Then in the late 2000s, significant changes in the availability of IT service suppliers and the flattening of service delivery created a situation in which our customers, who had historically been a “captive” audience, now had choices from outside the organization. They quickly learned that we weren’t the only game in town. Enter shadow IT. Were we still relevant to our customers? If our role wasn’t service provision, what was it?

When ITIL 4 came around, the framework transitioned from an internal process-heavy focus to an external, customer-centered focus. At the time, the shift toward customer value “felt” right, but I couldn’t put my finger on the reason why. For a number of years, I had noticed that our customers were reasonably happy with the services we provided. But when we started engaging them strategically with BRM (Business Relationship Management) by fostering a relationship in order to understand their business and what they really valued, their happiness increased significantly. This practice worked in a big way, but why?

Today, I made a connection between the outsized results we reaped with BRM and Aggregation Theory. The basic idea of Aggregation Theory is that value chains have three different groups: suppliers, distributors, and consumers/users¹. Before the Internet disrupted everything, distribution was expensive. Take the example of newspapers. Newspapers had to be physically distributed. Competitive advantage was gained by the distributors (e.g., The New York Times, The Washington Post, etc.) integrating the suppliers (i.e., journalists). The reason this worked was that customers outnumbered suppliers. A distributor that integrated supplier relationships had a significant advantage over distributors that didn’t. This was integration up the value stream.

Post-Internet 2.0, the cost of the customer transaction decreased to practically zero as distribution became aggregated. Using our example, newspapers moved to digital editions and the cost of distribution decreased. But along with lowering customer transaction costs came de-personalization of the relationship. I missed the sight of my paperboy meandering down the street on his bike only to toss my paper in the bushes. In the new era, customers became weary of thousands of scattershot email solicitations, the rampant buying and selling of their information, and the always annoying automated feedback requests.

“You’ve been chosen as one of our special customers to give us feedback today. For your time and effort, you’ll be eligible to receive a totally worthless coupon that you can’t redeem unless you stand on your head, pat your belly, and cough three times.”

Customers actually missed drop-in visits from support team members, calls from their sales reps, and conversations with the engineering teams. The ubiquity of low-value customer connections had increased the value of the personal relationship. And it wasn’t just the relationship, it was the nature of what we did for them. While we continued to provide IT services (though not all of them), our role had to shift to that of a strategic partner. We had to grieve that we would no longer have the exclusive affections of our customers and accept that they had become polyamorous, so to speak.

This is why the focus on value and relationship has taken center stage today. Successful organizations will be those that provide the best user experience. This means an increased focus on customer relationships and a careful curation of customer experiences – integrating customers down the value stream. It means continuously understanding what the customer really values. It means getting out and talking to our customers, and I don’t mean our robots talking to their robots. I mean WE have to talk with THEM.

¹Incidentally, ITIL 4 simplifies this model by describing two top-level roles: providers and consumers, and then extending the concept to the three-part model by stating that organizations are both consumers and providers. ITIL 4 focuses on the relationships between organizations in the service relationship model.

June 17, 2024
Home for the Holidays

Don’t you love spending quality time with family and friends eating and drinking lots of things you don’t need? You gotta love those conversations around the dinner table and if you’re anything like me, you’re thankful that Christmas comes but once a year. I had a number of interesting conversations this holiday, but I’m only going to share one of them.

This conversation happened at the dinner table and was initiated by one of my “in-laws.” I won’t say which one to protect the guilty. Now that I think of it, it wasn’t much of a conversation – more like a diatribe. “Higher education is a complete waste of time and money. All they do is indoctrinate your kids and teach them a bunch of radical ideology. Nobody should go to college. Anything you need to know, you can teach yourself for free. You just have to love learning, that’s the key.” It was an interesting topic to broach when you’re sitting next to two college professors.

Except for that last part, you can guess that I disagreed with just about all of it. But it was unsettling on a number of levels. For one, this type of argument against higher education used to be the exclusive purview of liberal-minded folks, but I’m hearing it more and more frequently from conservative types such as those at the Daily Wire. Matt Walsh is especially venomous in his attacks on “liberal college professors.”

While there is some truth in the “irrelevance” argument against higher education, I believe that we have a lot to offer students, if we can remember what higher education is really about. But poor leadership, a decline in the classic liberal arts education, and a rash of institutions in the news lately for behaving badly have fueled the fire. Higher ed is not making a good value proposition anymore. Steeply rising costs, mainly a result of falling state and federal support over the last 35 years, coupled with the explosion of the administrative university, have hindered our ability to provide value. We’ve allowed others to define the goal of higher education solely as skill development. This is why for-profit, online, nimble educational corporations are beating us, at least for the moment.

So what is education really about? I believe that colleges should re-focus on teaching the classic liberal arts education which, by definition, develops a student’s intellectual and moral character, rather than simply teaching them a set of skills. This type of education is designed to provide students with a broad understanding of the world and its history, as well as to teach them how to think critically and communicate effectively. It includes subjects such as literature, philosophy, history, and the fine arts.

And this is important: we need to educate students in these things in addition to teaching them skills that are useful to employers. After all, skills help us succeed in the workplace, but virtue helps us succeed in life.

What makes me sad is that our colleges seem to have lost this vision. Colleges of liberal arts are under attack and being cut at every turn. While part of this result is self-inflicted, it seems that not many college professors are interested in mentoring our students to pursue truth and virtue.

Higher education needs to rediscover what made it great in the first place.

January 4, 2024