
Microsoft and OpenAI Investigate Possible Misuse of AI Data by DeepSeek
Microsoft Corporation and OpenAI are investigating whether DeepSeek, a Chinese AI startup, improperly acquired data generated by OpenAI's technology. The investigation stems from observations by Microsoft's security researchers, who detected last fall that individuals possibly linked to DeepSeek were exfiltrating large amounts of data through the OpenAI API.
Unauthorized Data Access Concerns
These activities, described by individuals who requested anonymity due to the confidentiality of the matter, may constitute violations of OpenAI's terms of service. The observed behavior suggests attempts to bypass the limits OpenAI places on how much data can be retrieved through its API.
As OpenAI's primary partner and largest investor, Microsoft promptly notified the company of the suspicious activity. Neither OpenAI nor Microsoft, however, has commented on these developments.
DeepSeek's Market Disruption
The matter gained further complexity with DeepSeek's recent unveiling of an AI model named R1, which is capable of human-like reasoning and poses a considerable challenge to current leaders in the AI industry such as OpenAI, Google, and Meta Platforms Inc. DeepSeek's claim that R1 matches or even surpasses leading U.S. developers' models on several key benchmarks rattled technology stocks, including those of Microsoft and Nvidia, wiping nearly $1 trillion off their combined market value.
Despite the initial turbulence, technology shares gradually rebounded. Nvidia rose 9% on Tuesday, though it dipped 2.7% at Wednesday's market open, and Microsoft gained 3%. Meanwhile, ASML Holding NV, the prominent chip-equipment maker, posted its biggest intraday gain in four years after beating earnings expectations.
Allegations of Model Distillation
Adding to the controversy, David Sacks, a high-profile figure in the AI arena, voiced concerns about DeepSeek's use of OpenAI model outputs. In a public discussion, Sacks pointed to a process known as distillation, in which one AI model is trained on another model's outputs in order to replicate its capabilities. OpenAI acknowledged it is evaluating whether DeepSeek made inappropriate use of its models through such distillation techniques.
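To make the distillation concept concrete, here is a minimal, purely illustrative sketch: a "student" model is fit to reproduce a "teacher" model's soft output probabilities, without ever seeing the teacher's internal parameters. The models, data, and training loop here are hypothetical toy choices, not a description of DeepSeek's or OpenAI's actual systems.

```python
# Toy distillation sketch (illustrative only). A student model learns to
# mimic a teacher by training on the teacher's output probabilities.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical "teacher": a fixed linear classifier whose weights we
# pretend are hidden -- we can only query its output probabilities,
# much like querying a model through an API.
teacher_w = np.array([2.0, -1.0])

def teacher_predict(X):
    return sigmoid(X @ teacher_w)

# Build a distillation dataset of (input, teacher output) pairs.
X = rng.normal(size=(500, 2))
soft_labels = teacher_predict(X)

# Train a student of the same form by gradient descent on the
# cross-entropy between its outputs and the teacher's soft labels.
student_w = np.zeros(2)
lr = 0.5
for _ in range(2000):
    p = sigmoid(X @ student_w)
    grad = X.T @ (p - soft_labels) / len(X)
    student_w -= lr * grad

# Fraction of inputs where student and teacher probabilities agree
# to within 0.05 -- the student mimics the teacher's behavior.
agreement = np.mean(np.abs(teacher_predict(X) - sigmoid(X @ student_w)) < 0.05)
```

The key point of the sketch is that only the teacher's *outputs* are needed: given enough query/response pairs, the student converges to near-identical behavior, which is why terms of service for hosted models often restrict using their outputs to train competing systems.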
DeepSeek, for its part, maintains that it relied only on open-source models when distilling its R1 models. This stands in contrast to OpenAI's closed systems, while several of Meta's models, such as Llama, remain open source and publicly available.
OpenAI says it is reviewing the alleged distillation practices and will share further information as it becomes available.