Visualization of AI transforming occupations: Microsoft Research report based on 200,000 user–AI conversations

What the Evidence Really Says About Generative AI, Work, and the Future of Occupations

Microsoft Research has released a report analyzing 200,000 real user–AI conversations, and the findings reveal how generative AI is actually reshaping work and occupations.

Generative AI has moved from lab demo to daily tool. The central policy question is no longer whether it will touch work, but which work activities, and by extension which occupations, are most affected first. Tomlinson, Jaffe, Wang, Counts, and Suri (Microsoft Research) offer one of the strongest empirical answers to date: they analyze 200,000 anonymized U.S. conversations with Microsoft Bing Copilot (now Microsoft Copilot) from January 1 to September 30, 2024, and map those real-world interactions to standardized work activities from O*NET. This report is not a projection of potential job losses; it is an empirical measurement of current usage patterns and their concrete implications for occupations today.

Methodology

Intermediate Work Activities (IWAs)

  • The study maps conversations to Intermediate Work Activities (IWAs) from O*NET.
  • IWAs = generalizable work activities (332 in total), broader than occupation-specific tasks.
  • This mapping allows AI performance to be compared consistently across different jobs (a minimal classification sketch follows).
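The paper's classification prompts are not reproduced in this summary, so the sketch below is only illustrative: it assumes a generic `ask_llm` callable and a tiny, hypothetical subset of the 332 IWA labels, simply to show the shape of the conversation-to-IWA mapping.

```python
# Illustrative only: these IWA labels are a tiny hypothetical subset of the
# 332 O*NET Intermediate Work Activities, and `ask_llm` stands in for whatever
# LLM classification endpoint is actually used.
IWA_LABELS = [
    "Provide information to the public or customers",
    "Write material for artistic or commercial purposes",
    "Translate or interpret information",
    "Analyze financial or operational data",
]

def classify_iwa(conversation_text: str, ask_llm) -> str:
    """Ask an LLM to pick the single best-matching IWA for a conversation."""
    prompt = (
        "Label this user-AI conversation with exactly one O*NET Intermediate "
        "Work Activity (IWA) from the list below.\n"
        + "\n".join(f"- {label}" for label in IWA_LABELS)
        + "\n\nConversation:\n" + conversation_text
        + "\n\nAnswer with the label text only."
    )
    answer = ask_llm(prompt).strip()
    # Fall back to a sentinel if the model answers off-list.
    return answer if answer in IWA_LABELS else "UNMAPPED"
```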

Key Dimensions & Metrics

  • User Goal – the activity the user wanted to accomplish; classified from the conversation.
  • AI Action – what the AI actually produced; classified from the AI's output.
  • Satisfaction – whether users found the AI's response useful; measured via thumbs up/down feedback.
  • Task Completion – whether the AI completed the intended activity; measured with an LLM-based completion classifier.
  • Scope of Impact – whether the AI handled a small piece or a moderate share of the activity; measured with an LLM-based scope classifier (an illustrative record sketch follows).
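To make these dimensions concrete, here is a small illustrative record structure; the field names and encodings are assumptions for exposition, not the paper's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConversationLabels:
    """Illustrative per-conversation record; field names are assumptions, not the paper's schema."""
    user_goal_iwa: str          # IWA the user was trying to accomplish
    ai_action_iwa: str          # IWA the AI actually performed
    thumbs_up: Optional[bool]   # explicit thumbs up/down feedback, if the user gave any
    completed: bool             # LLM classifier: did the AI complete the intended activity?
    scope: float                # LLM classifier: share of the activity the AI handled,
                                # encoded here as 0.0-1.0 for simplicity
```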

AI Applicability Score

  • Combines three factors:
    1. Coverage – how many of an occupation’s IWAs are represented in the Copilot data.
    2. Completion Rates – how successfully AI completes those IWAs.
    3. Scope of Impact – the breadth of AI’s contribution to the activity.
  • Produces a holistic measure of near-term overlap between AI capabilities and human work activities (a simplified aggregation sketch follows).
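This summary does not give the paper's exact aggregation formula, so the following is a minimal sketch under a simplifying assumption: coverage, mean completion, and mean scope are combined as a simple product. The occupation-level scores reported below do not follow this exact product, so treat it as an illustration of the idea, not the authors' formula.

```python
from statistics import mean

def applicability_score(occupation_iwas, iwa_stats):
    """Simplified, illustrative occupation-level score.

    occupation_iwas: list of IWA labels making up the occupation (from O*NET)
    iwa_stats: dict of IWA label -> {"completion": float, "scope": float},
               estimated from the Copilot conversation data (0-1 scales assumed)
    """
    covered = [iwa for iwa in occupation_iwas if iwa in iwa_stats]
    if not covered:
        return 0.0
    coverage = len(covered) / len(occupation_iwas)   # share of the occupation's IWAs seen in usage data
    completion = mean(iwa_stats[iwa]["completion"] for iwa in covered)
    scope = mean(iwa_stats[iwa]["scope"] for iwa in covered)
    # A simple product keeps the score in [0, 1]; the authors' actual aggregation differs in detail.
    return coverage * completion * scope
```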

What Workers Ask For vs. What AI Actually Does

User goals (what people try to do): The most common are information gathering, writing, and communicating with others. These also score highest on satisfaction and completion: users not only attempt these tasks with AI, they often succeed.

AI actions (what the model actually performs): Copilot most often provides information and assistance, writes, and teaches/advises, acting like a coach or tutor rather than a substitute project owner. The 40% disjoint rate between user goals and AI actions captures this service orientation: e.g., a user's goal might be "gather market data," while the AI's action is "provide information to the user."

Depth vs. breadth. Impact scope is consistently lower on the AI-action side than on the user-goal side, indicating that AI can help with a broader fraction of someone's work than it can perform directly, again an empirical marker of augmentation. Correlations in the appendix show that scope is less correlated with completion than satisfaction is (IWA-level r ≈ 0.45 and 0.22, respectively), and that scope best predicts what people seek AI help for (r ≈ 0.64 with log user-goal activity share). Example: A user may want to "analyze a financial report and draft insights for a presentation." AI might complete part of that goal by summarizing key figures or highlighting trends, but it typically will not generate the full business-ready presentation. This shows how AI contributes breadth (touching many tasks) but often with limited depth in execution.
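As an illustration of how such IWA-level correlations could be computed from a labeled dataset, here is a short sketch; the `iwa_level` table, its column names, and its values are invented for exposition rather than taken from the paper.

```python
import numpy as np
import pandas as pd

# Hypothetical IWA-level summary; in the paper these quantities are estimated
# from the 200,000 labeled Copilot conversations.
iwa_level = pd.DataFrame({
    "iwa": ["provide information", "write material", "repair equipment"],
    "completion": [0.90, 0.84, 0.35],        # share of conversations judged completed
    "satisfaction": [0.88, 0.80, 0.40],      # share with positive feedback
    "scope": [0.55, 0.60, 0.10],             # average judged scope of impact
    "user_goal_share": [0.20, 0.15, 0.001],  # share of all user goals mapped to this IWA
})

# Pearson correlations at the IWA level (toy values, shown only for shape).
print(iwa_level["completion"].corr(iwa_level["satisfaction"]))
print(iwa_level["completion"].corr(iwa_level["scope"]))
print(iwa_level["scope"].corr(np.log(iwa_level["user_goal_share"])))
```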

Occupational Applicability, With Data 

Based on applicability scores, AI shows the highest involvement in knowledge-heavy and communication-focused roles:

  • Interpreters and Translators – Coverage 98%, Completion 88%, Scope 57%, Overall Involvement Score 49% (Employment: 51,560)
  • Historians – 91%, 85%, 56%, 48% (3,040)
  • Passenger Attendants – 80%, 88%, 62%, 47% (20,190)
  • Sales Representatives of Services – 84%, 90%, 57%, 46% (1,142,020)
  • Writers and Authors – 85%, 84%, 60%, 45% (49,450)
  • Customer Service Representatives – 72%, 90%, 59%, 44% (2,858,710)

Other occupations with notable involvement include CNC Tool Programmers, Brokerage Clerks, Reporters/Journalists, Proofreaders, Editors, PR Specialists, and Mathematicians.

Occupations where AI is least involved

By contrast, AI plays little role in hands-on, physical, or in-person jobs, reflecting the study’s focus on LLMs (not robotics). Examples:

  • Phlebotomists – Involvement Score 3%
  • Nursing Assistants – 3%
  • Hazardous Materials Removal Workers – 3%
  • Pile Driver Operators, Water Treatment Plant Operators, Roofers, Dishwashers, Maids and Housekeeping Cleaners – similarly low scores.

Involvement by occupation groups

  • Highest involvement (employment-weighted; a short weighted-average sketch follows this list):
    • Sales and Related – 32%
    • Computer and Mathematical – 30%
    • Office and Administrative Support – 29%
    • Community and Social Service; Arts, Design, Entertainment, Sports, and Media; Business and Financial Operations; Educational Instruction and Library – slightly lower but still high.
  • Education & wages: AI involvement is somewhat higher in jobs requiring a Bachelor’s degree (≈27%), and modestly higher for upper-middle wage jobs, but not confined to the very top earners.
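The group-level figures are employment-weighted averages of per-occupation scores. A minimal sketch of that calculation is below; the rows mix figures quoted earlier in this article with placeholder values, so the printed numbers are illustrative only.

```python
import pandas as pd

# Per-occupation rows: the first two use figures quoted above; the Roofers row
# is a placeholder to show a low-involvement group.
occ = pd.DataFrame({
    "occupation": ["Customer Service Representatives", "Writers and Authors", "Roofers"],
    "group": ["Office and Administrative Support",
              "Arts, Design, Entertainment, Sports, and Media",
              "Construction and Extraction"],
    "employment": [2_858_710, 49_450, 135_000],   # Roofers employment is a placeholder
    "score": [0.44, 0.45, 0.04],                  # Roofers score is a placeholder
})

# Employment-weighted mean score per occupation group.
totals = (
    occ.assign(weighted=occ["score"] * occ["employment"])
       .groupby("group")[["weighted", "employment"]]
       .sum()
)
group_involvement = totals["weighted"] / totals["employment"]
print(group_involvement.sort_values(ascending=False))
```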

Notable Implications from Occupational Involvement of AI

  1. AI as an Information Conduit: Reshaping Communication-Centric Work
    The strong alignment with occupations centered on information delivery (customer service, sales, broadcasting, hosting) highlights AI’s immediate role as a communication layer. This suggests that the first wave of workplace AI adoption is less about replacing decision-making and more about streamlining interactions. However, the implication is twofold: while efficiency gains are evident, the risk of commoditizing human communication skills grows, potentially undermining the value placed on empathy, persuasion, and cultural nuance that AI cannot yet replicate.
  2. The Transformation of Writing and Knowledge Work
    Writers, editors, and technical communicators appear as clear beneficiaries of AI augmentation, given the high rates of satisfaction and completion in tasks like drafting, revising, and summarizing. The implication here is that AI strengthens throughput rather than originality. This raises critical questions: will the professional writing market polarize between high-volume, AI-assisted commodity writing and niche, human-authored premium content? The evidence points to augmentation, but the longer-term concern is a shift in how expertise and creativity are valued.
  3. Friction in Analytical and Technical Roles
    Data science, web development, and mathematical fields show mixed results: while AI supports advisory or text-based tasks, it struggles with structured data manipulation and visual design. The implication is that AI's role in technical occupations remains supplementary, highlighting a gap between natural-language fluency and domain-specific execution. Critically, this underscores the need for hybrid workflows that marry LLMs with specialized tools, rather than expecting standalone AI to deliver full value.
  4. Sales and Customer Operations: Augmentation of Advisory Functions
    High involvement in sales and customer operations reflects AI's strong alignment with advising, explaining, and guiding customers. The implication is that AI augments human persuasion and relationship management rather than replacing it. Yet a risk emerges: if organizations lean too heavily on AI-driven advising, the human element of trust-building, central to long-term sales success, may erode. Thus, adoption here could deepen efficiency but weaken relational depth if unbalanced.
  5. Boundaries of AI Impact: Physical and Care Work
    The very low involvement in manual, caregiving, or machine-operation roles highlights a structural limitation: today’s LLMs have minimal impact on embodied labor. This suggests a digital divide in AI adoption: cognitive, communication-heavy jobs see rapid augmentation, while physical and care-intensive sectors remain insulated. However, this divide should not be read as permanent protection—future advances in robotics, vision, and multimodal AI could eventually bridge this gap, raising ethical and policy questions about how augmentation will extend into the most human-centric forms of work.

Augmentation vs. Automation: What the Metrics Actually Say

  • The 40% disjoint rate between user goals and AI actions is direct evidence that AI is commonly augmenting rather than replacing (a small sketch of the rate computation follows this list). Example: user seeks "resolve computer issues" (user-goal IWA), AI performs "provide technical support" (AI-action IWA) as an instructional layer.
  • Scope is broader on user goals than AI actions, meaning workers can leverage AI across a wider share of their tasks than AI can autonomously perform, a measured asymmetry consistent with human-in-the-loop patterns.
  • The authors explicitly note the analysis does not observe downstream business decisions (hiring, restructuring), so one cannot infer job losses or wage effects from these usage metrics alone. 
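For concreteness, the disjoint rate is simply the share of conversations whose user-goal IWA differs from the AI-action IWA. A tiny sketch follows; the records are invented, and only the roughly 40% figure comes from the paper.

```python
# Invented examples of (user-goal IWA, AI-action IWA) pairs.
labeled = [
    ("resolve computer issues", "provide technical support"),
    ("gather market data", "provide information to the user"),
    ("write material", "write material"),
]

disjoint = sum(1 for goal, action in labeled if goal != action)
print(f"Disjoint rate: {disjoint / len(labeled):.0%}")
# On the full 200,000-conversation dataset this rate is roughly 40%.
```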

Policy & Education Responses (Aligned to the Evidence)

  1. Target information work. Since providing/communicating information and writing are the most exposed IWAs, prioritize training on prompt design, critical synthesis, source checking, and drafting workflows in sales, service, media, PR, and admin roles. This is where measured completion/scope are already high. 
  2. Codify human oversight. Given the augmentation signature (40% user–AI mismatch; lower AI-action scope), set policy guardrails for advice-giving AI: disclosure, audit trails of sources, escalation to human experts for consequential decisions.
  3. Credential rethink. With slightly higher exposure among Bachelor’s-required roles, invest in continuous upskilling for “knowledge communicators” (writers, editors, analysts, customer operations). Make evidence evaluation and explanatory writing core competencies.
  4. Data & design literacy. Where the report shows narrower scope (data analysis, visual design), develop hybrid curricula that join LLM prompting with spreadsheet/database fundamentals and visual reasoning, bridging current weak spots.
  5. Measurement-first regulation. This paper’s activity-level lens should inform policy: regulate outcomes (accuracy, accountability) at the task level before sweeping occupation-level rules. Encourage secure telemetry for continued, privacy-preserving measurement. 

Conclusion

Tomlinson et al. provide a rigorous, usage-based snapshot of how workers and an LLM actually interact at scale and what that implies for occupational applicability today. The strongest, repeatedly measured signal is that AI is acting as an information and writing partner, a coach, aide, or explainer. Exposure concentrates where communicating information is central to the job; it is minimal for LLMs in hands-on physical roles. The authors' careful separation of user goals and AI actions, and their construction of an AI applicability score, make this one of the most informative empirical baselines for policy, workforce planning, and curriculum design. It doesn't tell us how firms will reorganize around these capabilities; it tells us, with evidence, where the capabilities fit right now, and that is precisely what decision-makers need in 2025.

Primary source: Tomlinson, Jaffe, Wang, Counts, and Suri, "Working with AI: Measuring the Occupational Implications of Generative AI," arXiv:2507.07935, v3, July 22, 2025. The preprint is also listed on the Microsoft Research publication page.

FAQ

Q1: What is the Microsoft Research AI report about?

The report analyzes 200,000 anonymized user–AI conversations to understand how generative AI supports or substitutes for human work activities across different occupations.

Q2: Which jobs are most affected by generative AI?

The report finds AI is most involved in communication- and information-centric knowledge work, such as interpreting and translating, writing and editing, sales, and customer service, and least involved in hands-on physical roles such as nursing assistants, hazardous materials removal workers, roofers, and other manual trades.

Q3: What are the socioeconomic implications of AI’s impact on jobs?

The report measures current usage rather than job losses, but the patterns still carry socioeconomic weight: AI can act as a career accelerator for some workers while commoditizing the routine tasks of others, raising concerns about digital inequality. For example, workers in outsourced customer support roles in emerging economies may be especially exposed if LLMs handle routine inquiries more cheaply.