## Key Takeaway
The most common management mistake with AI teams is applying pure product delivery metrics to work that requires experimentation, creating pressure that leads to skipped evaluation and production incidents.
## How AI Teams Are Different
Managing AI teams requires adapting standard engineering management practices in several key ways. The work is inherently more experimental: a large share of ML experiments produce negative results, and that is normal and expected. Feedback loops are longer: model training, evaluation, and iteration cycles take days or weeks rather than hours. Skill profiles are more diverse: AI teams need researchers, ML engineers, data engineers, and MLOps specialists, each with different working styles, career expectations, and evaluation criteria. Understanding these differences is a prerequisite to managing an AI team effectively.
## Hiring the Right Team
The first hiring decision is the team's skill mix. Most teams need more ML engineers (who bridge research and production) than pure researchers or pure data scientists. The ideal early team has ML engineers who can train models AND deploy them to production, supplemented by data engineers who can build reliable data pipelines. Hire pure researchers only when your problem requires novel model architectures or approaches, not when existing models need to be adapted and deployed.
| Role | Focus | When to Hire | Interview Signal |
|---|---|---|---|
| ML Engineer | Model development + production deployment | First AI hire; core of every AI team | Can discuss both model architecture and serving infrastructure |
| Data Engineer | Data pipelines, quality, feature engineering | When data preparation becomes a bottleneck | Experience with data quality frameworks and pipeline orchestration |
| MLOps Engineer | ML infrastructure, CI/CD, monitoring | When you have 2+ models in production | Experience with model deployment, monitoring, and automated retraining |
| Applied Researcher | Novel model development, evaluation methodology | When existing models do not meet quality bar | Can explain why a standard approach fails and propose alternatives |
| Data Scientist | Analysis, experimentation, insight generation | When business needs exploratory analysis alongside ML | Strong statistical foundation and communication skills |
## Goal Setting for AI Work
Traditional OKRs map poorly to AI work because outcomes are uncertain. A team can do excellent work on a well-designed experiment and still produce a negative result. Adapt goal-setting by separating delivery goals from learning goals. Delivery goals are standard: ship feature X with Y quality by date Z. Learning goals acknowledge uncertainty: run experiment X to test hypothesis Y, report results by date Z regardless of outcome. Both types of goals are first-class contributions.
1. **Delivery OKRs (50-60% of goals).** Apply to work with known approaches and predictable outcomes: deploying an existing model to a new use case, building infrastructure, improving monitoring, or integrating with a well-understood API. These follow standard engineering goal-setting practices.
2. **Learning OKRs (25-35% of goals).** Apply to experimental work: testing a new model architecture, evaluating a new approach, or running an A/B test. The key result is the learning produced, not the outcome. 'Complete evaluation of approach X and document findings' is a valid key result.
3. **Platform OKRs (15-20% of goals).** Apply to infrastructure and tooling work: improving the ML platform, reducing model deployment time, adding monitoring capabilities. These are standard engineering goals but should be tracked separately to ensure the team invests in platform health.
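The three-way split above can be sanity-checked mechanically when planning a quarter. The sketch below is a minimal illustration, not a prescribed tool: the band values come from the guideline percentages in this guide, while the function name, the category labels, and the example OKR titles are assumptions for illustration.

```python
from collections import Counter

# Bands taken from the guideline percentages above; tune per team.
TARGET_BANDS = {
    "delivery": (0.50, 0.60),
    "learning": (0.25, 0.35),
    "platform": (0.15, 0.20),
}

def check_okr_mix(okrs):
    """Return (category -> share) and the categories outside their band.

    `okrs` is a list of (title, category) tuples; categories must match
    the keys in TARGET_BANDS.
    """
    counts = Counter(category for _, category in okrs)
    total = sum(counts.values())
    shares = {cat: counts.get(cat, 0) / total for cat in TARGET_BANDS}
    out_of_band = [
        cat for cat, (lo, hi) in TARGET_BANDS.items()
        if not (lo <= shares[cat] <= hi)
    ]
    return shares, out_of_band

# Hypothetical quarter: 3 delivery, 2 learning, 1 platform goal.
quarter = [
    ("Ship reranker to search", "delivery"),
    ("Deploy model v2 to EU region", "delivery"),
    ("Migrate feature store", "delivery"),
    ("Evaluate long-context architecture", "learning"),
    ("A/B test new retrieval approach", "learning"),
    ("Cut model deployment time", "platform"),
]
shares, flags = check_okr_mix(quarter)
```

For the six-goal example, all three shares (0.50, 0.33, 0.17) fall inside their bands, so `flags` is empty; a portfolio with no learning OKRs would flag immediately, which is exactly the drift this section warns against.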
## Performance Evaluation
Evaluating AI practitioners requires adjusting your calibration framework. An ML engineer who runs three well-designed experiments that all produce negative results has contributed valuable knowledge to the organization. If you only reward positive experimental outcomes, you incentivize skipping evaluation and shipping models that are not ready. Evaluate based on: the rigor of the experimental approach, the quality of the analysis, the clarity of the documentation, and the contribution to organizational learning. Positive business outcomes are a team metric, not an individual performance metric for experimental work.
## Sprint Management
AI sprint planning requires accommodations that standard product sprints do not. Explicitly allocate capacity for experimentation (20-30% of sprint capacity). Accept that some work items will span multiple sprints (training runs, large-scale evaluations). Create a process for handling 'blocked on data' situations, which are far more common in AI teams than in product engineering teams. Use the AI Sprint Planning Agenda template to structure sprint planning sessions that balance feature delivery with exploration.
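The experimentation reserve described above is simple arithmetic, but making it explicit in planning prevents it from being silently absorbed by delivery work. A minimal sketch, assuming story points as the capacity unit and 25% as a midpoint of the 20-30% guideline (the function name and defaults are illustrative, not a standard):

```python
def plan_sprint_capacity(team_points, experiment_share=0.25):
    """Split sprint capacity into delivery and experimentation buckets.

    experiment_share follows the 20-30% guideline; 0.25 is an assumed
    midpoint, not a prescription.
    """
    experiment = round(team_points * experiment_share)
    delivery = team_points - experiment
    return {"delivery": delivery, "experiment": experiment}

# e.g. a five-person team averaging 8 points each per sprint
buckets = plan_sprint_capacity(40)
```

Reserving the experimentation bucket first, then planning delivery into the remainder, mirrors how the guide treats learning work as first-class rather than leftover capacity.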
## Stakeholder Management
AI work creates stakeholder management challenges that product engineering does not. Timelines are less predictable, outcomes are uncertain, and progress is harder to demonstrate. Address this by: setting expectations about experimental uncertainty upfront, providing regular progress updates that include negative results (framed as valuable learning), using demos strategically (only demo when the system is representative of final quality, not before), and maintaining a backlog of 'safe wins' that can be delivered when experimental work stalls.
Never demo an AI system that is not representative of production quality. Early demos of impressive but cherry-picked results set expectations that the team cannot meet, leading to stakeholder disappointment and erosion of trust. Wait until the system consistently performs at a level you are comfortable showing to a skeptical audience.
## Career Development
AI practitioners need career paths that value both research and engineering contributions. An ML engineer who builds production-grade infrastructure should be able to advance as fast as one who publishes research. Define career levels that account for both tracks: the engineering track emphasizes production systems, reliability, and team impact; the research track emphasizes novel approaches, evaluation methodology, and knowledge contribution. Allow movement between tracks as interests evolve.
## Version History
1.0.0 · 2026-03-01
- Initial guide for managing AI teams