2. Building Lilli at the speed of change

ChatGPT and similar tools were gaining rapid attention in early 2023. McKinsey wanted to give colleagues a fully secure way to tap into the power of large language models (LLMs) as soon as possible. Both time and security were of the essence, so McKinsey moved quickly to mobilize the Lilli effort.

Proof of concept

Roadmap and operating model

Development decisions

Build, test, iterate

Rollout

Creating the proof of concept 
March 2023, Duration 1 week

It all began with a small team that built a proof of concept for Lilli in just one week. That lean build and a five-minute presentation provided leadership with more than enough evidence to support investment in developing a full-blown gen AI platform.


McKinsey opened Lilli to colleagues gradually over the course of three months, with the platform becoming available to all employees by October 2023. The Lilli development team continued to add new features and improve the performance of existing ones, guided by an unwavering focus on quantitative and qualitative user feedback. New capabilities are always tested rigorously with alpha and beta users to assess performance against carefully selected metrics, such as user interaction depth and output quality, before release to the entire firm.

100% of McKinsey employees have access to Lilli
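The pre-release testing described above can be pictured as a simple metric gate. The following is a minimal sketch, not McKinsey's actual process: the metric names echo the two examples the text mentions (user interaction depth and output quality), while the threshold values, units, and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical thresholds: the text names these metrics but gives no targets.
RELEASE_THRESHOLDS = {
    "interaction_depth": 3.0,  # assumed unit: average follow-up turns per session
    "output_quality": 4.0,     # assumed unit: average user rating on a 1-5 scale
}


def failing_metrics(observed, thresholds=RELEASE_THRESHOLDS):
    """Return the sorted names of metrics that are missing or below threshold."""
    return sorted(
        name
        for name, minimum in thresholds.items()
        if observed.get(name, float("-inf")) < minimum
    )


def ready_for_release(observed, thresholds=RELEASE_THRESHOLDS):
    """A capability passes the gate only if no metric falls short."""
    return not failing_metrics(observed, thresholds)


if __name__ == "__main__":
    # Made-up measurements from a hypothetical beta cohort.
    beta_results = {"interaction_depth": 2.4, "output_quality": 4.3}
    print(ready_for_release(beta_results))
    print(failing_metrics(beta_results))
```

Treating a missing metric as a failure is a deliberate choice in this sketch: a capability that was never measured does not pass by default.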

The team developed the Lilli MVP in roughly five weeks and provided access to a set of 200 alpha users, enabling nearly a month of testing and iteration before gradually offering Lilli to the entire firm.

Chart: Alpha testing and user feedback enabled rapid improvements. Improvement in answer quality during alpha testing, shown against a 1x baseline: +32% (Week 1 to Week 2), +52% (Week 2 to Week 3).

Development decisions for the gen AI stack were guided by a five-point framework that evaluated cost, scalability, performance, security, and timing. When it came to choosing a taker, shaper, or maker strategy for Lilli’s underlying LLMs (using off-the-shelf models as is, customizing existing models with the firm’s own data, or building models from scratch), the best answer turned out to be shaper with a dash of maker. Lilli uses a prebuilt model hosted by a hyperscaler, and the team trained five of its own smaller expert models to understand user intent and improve the relevance of Lilli’s answers.

Lilli’s development decisions were based on five criteria: security, performance, scalability, timing, and cost.

Rollout stages shown: Initial firm launch (45,000 users); MVP (5,000 users); Beta (500 users).

A more robust central team coalesced to plan the approach before work began on a minimum viable product (MVP). After aligning on a North Star vision, the team established, collected, and prioritized use cases. Through workshops and interviews, they developed use case profiles that helped assess the value, impact, feasibility, and technology requirements for each, which enabled leadership to quickly align on which to tackle first. Cross-functional agile squads, with a 1:1 ratio of nontechnical to technical professionals, were then stood up for core areas such as technical development, safety and security, user analytics, and adoption. Kitti and other leaders guided the entire Lilli platform experience and orchestrated the teams to deliver it.

McKinsey followed a three-step approach to establishing Lilli’s use case roadmap:

Prioritized list of business-led use cases and initial users

Use case profile and initial tech requirements

Prioritized roadmap of use cases
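The use case profiles described above assess each candidate on value, impact, and feasibility. One way such profiles could be turned into a ranked roadmap is a weighted score, sketched below. This is an assumption made for illustration only: the source does not say how the dimensions were combined, and the weights, the 1-to-5 rating scale, the class and function names, and the example use cases are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the source does not say how the dimensions were combined.
WEIGHTS = {"value": 0.4, "impact": 0.3, "feasibility": 0.3}


@dataclass(frozen=True)
class UseCaseProfile:
    """One use case rated on an assumed 1 (low) to 5 (high) scale."""
    name: str
    value: int
    impact: int
    feasibility: int

    def score(self) -> float:
        """Weighted sum of the three ratings."""
        return (
            WEIGHTS["value"] * self.value
            + WEIGHTS["impact"] * self.impact
            + WEIGHTS["feasibility"] * self.feasibility
        )


def prioritize(profiles):
    """Return the profiles ordered from highest to lowest weighted score."""
    return sorted(profiles, key=lambda p: p.score(), reverse=True)


# Made-up placeholder use cases and ratings, not taken from the source.
EXAMPLE_PROFILES = [
    UseCaseProfile("Use case B", value=4, impact=5, feasibility=2),
    UseCaseProfile("Use case A", value=5, impact=4, feasibility=4),
    UseCaseProfile("Use case C", value=3, impact=3, feasibility=5),
]

if __name__ == "__main__":
    for rank, profile in enumerate(prioritize(EXAMPLE_PROFILES), start=1):
        print(rank, profile.name, round(profile.score(), 2))
```

A single weighted score keeps the ranking easy to explain to leadership. Technology requirements, which the profiles also capture, are left out here because they are harder to reduce to one number.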


Rolling out Lilli

Building, testing, and iterating

Making key development decisions 
May 2023, Duration 2 weeks

Establishing the roadmap and operating model

Ideation: Stakeholder workshops, user interviews, best practice research

Impact and feasibility assessments: Value and impact sizing, additional stakeholder and user interviews, discussions with technology leaders
