D3DKMTEscape universally rejected (STATUS_BUFFER_TOO_SMALL / EOVERFLOW) for Intel NPU compute adapter — NPU enumerable but not driveable #40445

@Aristide021

Summary

On Windows 11 with an Intel Core Ultra NPU (8086:7D1D), the NPU is correctly enumerated as a compute adapter inside WSL2 via dxgkrnl/libdxcore.so, and adapter/device handles open successfully. However, every D3DKMTEscape call with a non-empty payload returns STATUS_BUFFER_TOO_SMALL (0xC0000023), regardless of escape type, flags, buffer size, or content. The Linux-side kernel log reports the same failure as dxgkio_escape: Ioctl failed: -75, where -75 is -EOVERFLOW, the errno translation of STATUS_BUFFER_TOO_SMALL.

This is distinct from the general GPU/NPU feature question in #13292: here the NPU is already enumerable from WSL2. The DirectML, oneAPI, and Mesa paths mentioned in that thread apply only to GPU workloads; there is no DirectML/OpenCL path for the NPU. The only compute path for the NPU requires D3DKMTEscape for process registration in Intel's driver, and that path is broken at the WSL2 host proxy layer.

Intel has confirmed (intel/linux-npu-driver#56) that this gating is on Microsoft's side.


Environment

  • Host: Windows 11 Home (build 10.0.26100.8246)
  • CPU: Intel Core Ultra 7 155H (Meteor Lake)
  • NPU: Intel(R) AI Boost, PCI VEN_8086:DEV_7D1D
  • WSL version: 2.6.3.0
  • WSL2 kernel: 6.6.87.2-1-microsoft-standard-WSL2
  • WSLg version: 1.0.71
  • DXCore version: 10.0.26100.1-240331-1435.ge-release
  • Direct3D version: 1.611.1-81528511
  • Distro: Ubuntu 24.04.1 LTS
  • libdxcore: /usr/lib/wsl/lib/libdxcore.so

wsl --version

WSL version: 2.6.3.0
Kernel version: 6.6.87.2-1
WSLg version: 1.0.71
MSRDC version: 1.2.6353
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.26100.8246

dmesg | grep -i dxg (escape failures)

[    0.511441] hv_vmbus: registering driver dxgkrnl
[    3.193279] misc dxg: dxgk: dxgkio_is_feature_enabled: Ioctl failed: -22
[    3.195910] misc dxg: dxgk: dxgkio_query_adapter_info: Ioctl failed: -22
[ 3845.965329] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.966207] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.966974] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.967480] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.967943] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.968474] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.969008] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.969440] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.969994] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.971006] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75

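The return value -75 is -EOVERFLOW (errno 75), the Linux translation of STATUS_BUFFER_TOO_SMALL (0xC0000023) through the dxgkrnl proxy layer. The two -22 (-EINVAL) lines logged at boot come from dxgkio_is_feature_enabled and dxgkio_query_adapter_info, not from the escape path.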
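
To make the log easier to triage, the failures can be grouped by ioctl handler and errno name. The following is a minimal sketch, not part of any WSL tooling: the `summarize` helper and its regular expression are illustrative, and the errno table holds only the two x86-64 Linux values that appear in the log above.

```python
import re
from collections import Counter

# Linux errno values seen in the log above (x86-64 numbering).
ERRNO_NAMES = {22: "EINVAL", 75: "EOVERFLOW"}

# Matches e.g. "dxgk: dxgkio_escape: Ioctl failed: -75"
LINE_RE = re.compile(r"dxgk: (\w+): Ioctl failed: (-?\d+)")

def summarize(dmesg_text):
    """Count (ioctl handler, errno name) pairs found in dmesg output."""
    counts = Counter()
    for match in LINE_RE.finditer(dmesg_text):
        handler, code = match.group(1), int(match.group(2))
        # The kernel logs negative errno values, so negate before lookup.
        name = ERRNO_NAMES.get(-code, f"errno {-code}")
        counts[(handler, name)] += 1
    return counts

sample = """\
[    3.193279] misc dxg: dxgk: dxgkio_is_feature_enabled: Ioctl failed: -22
[ 3845.965329] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
[ 3845.966207] misc dxg: dxgk: dxgkio_escape: Ioctl failed: -75
"""

for (handler, name), n in summarize(sample).items():
    print(f"{handler}: {name} x{n}")
```

Feeding it the full output of `dmesg | grep -i dxg` separates the boot-time -EINVAL entries from the -EOVERFLOW entries produced by the escape calls.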


What works

  • D3DKMTEnumAdapters3 (COMPUTE_ONLY filter) returns the NPU with correct vendor/device IDs (8086:7D1D)
  • D3DKMTOpenAdapterFromLuid succeeds
  • D3DKMTCreateDevice succeeds
  • D3DKMTEscape with PrivateDriverDataSize == 0 returns STATUS_SUCCESS

What fails

D3DKMTEscape with any non-empty payload returns STATUS_BUFFER_TOO_SMALL regardless of:

  • Escape Type (tested 0–14)
  • Flags (tested 0, 0x1, 0x8, 0xff, 0x1000, and combinations)
  • Buffer sizes 1–65536 bytes
  • Payload content, including a byte-exact replay of a 568-byte init escape captured via Frida from a successful Windows-side OpenVINO NPU run
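
The sweep over these parameters can be sketched as follows. This is an illustration rather than the exact harness used for the tests: `call_escape` is a hypothetical wrapper around D3DKMTEscape taking (type, flags, payload size), and `fake_escape` is a stand-in that only mimics the behaviour reported here so the sketch runs without an NPU.

```python
from itertools import product

STATUS_SUCCESS = 0x00000000
STATUS_BUFFER_TOO_SMALL = 0xC0000023

def sweep(call_escape, types, flags, sizes):
    """Call call_escape for every combination and group combinations by status."""
    results = {}
    for esc_type, flag, size in product(types, flags, sizes):
        status = call_escape(esc_type, flag, size) & 0xFFFFFFFF
        results.setdefault(status, []).append((esc_type, flag, size))
    return results

def fake_escape(esc_type, flags, size):
    # Stand-in for the real call: succeeds only for an empty payload.
    return STATUS_SUCCESS if size == 0 else STATUS_BUFFER_TOO_SMALL

results = sweep(
    fake_escape,
    types=range(0, 15),                    # escape types 0-14
    flags=[0, 0x1, 0x8, 0xFF, 0x1000],
    sizes=[0, 1, 64, 568, 4096, 65536],
)
for status, combos in sorted(results.items()):
    print(f"{status:#010x}: {len(combos)} combinations")
```

Replacing `fake_escape` with a function that fills a D3DKMT_ESCAPE structure and calls libdxcore turns this into a real sweep; grouping by status makes it obvious when payload size is the only parameter that changes the outcome.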

Minimal reproduction

import ctypes
from ctypes import POINTER, Structure, c_uint32, c_uint64, c_void_p, byref

class LUID(Structure):
    _fields_ = [('LowPart', c_uint32), ('HighPart', c_uint32)]
class D3DKMT_ADAPTERINFO(Structure):
    _fields_ = [('hAdapter', c_uint32), ('AdapterLuid', LUID),
                ('NumOfSources', c_uint32), ('bPresentMoveRegionsPreferred', c_uint32)]
class D3DKMT_ENUMADAPTERS3(Structure):
    _fields_ = [('Filter', c_uint64), ('NumAdapters', c_uint32), ('_pad', c_uint32),
                ('pAdapters', POINTER(D3DKMT_ADAPTERINFO))]
class D3DKMT_OPENADAPTERFROMLUID(Structure):
    _fields_ = [('AdapterLuid', LUID), ('hAdapter', c_uint32)]
class D3DKMT_ESCAPE(Structure):
    _fields_ = [('hAdapter', c_uint32), ('hDevice', c_uint32),
                ('Type', c_uint32), ('Flags', c_uint32),
                ('pPrivateDriverData', c_void_p), ('PrivateDriverDataSize', c_uint32),
                ('hContext', c_uint32)]

lib = ctypes.CDLL('/usr/lib/wsl/lib/libdxcore.so')

# Enumerate compute adapters — succeeds, NPU appears as VEN_8086:DEV_7D1D
infos = (D3DKMT_ADAPTERINFO * 16)()
ea = D3DKMT_ENUMADAPTERS3()
ea.Filter = 1  # bit 0: include compute-only adapters
ea.NumAdapters = 16
ea.pAdapters = ctypes.cast(infos, POINTER(D3DKMT_ADAPTERINFO))
assert lib.D3DKMTEnumAdapters3(byref(ea)) == 0

npu_luid = infos[0].AdapterLuid  # assumes the NPU is the first adapter returned; adjust the index otherwise

# Open adapter — succeeds
oa = D3DKMT_OPENADAPTERFROMLUID()
oa.AdapterLuid = npu_luid
assert lib.D3DKMTOpenAdapterFromLuid(byref(oa)) == 0

# Escape with empty payload — succeeds
e = D3DKMT_ESCAPE()
e.hAdapter = oa.hAdapter
e.PrivateDriverDataSize = 0
assert lib.D3DKMTEscape(byref(e)) == 0

# Escape with any non-empty payload: BUG
buf = ctypes.create_string_buffer(64)  # 64 zero-initialized bytes
e.pPrivateDriverData = ctypes.cast(buf, c_void_p)
e.PrivateDriverDataSize = 64
status = lib.D3DKMTEscape(byref(e)) & 0xFFFFFFFF
print(hex(status))  # 0xc0000023 — STATUS_BUFFER_TOO_SMALL
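
Because ctypes treats the return value as a signed C int by default, a failing NTSTATUS comes back negative unless it is masked as in the reproduction above. A small helper can normalize and name the value; this is a sketch, `describe_ntstatus` is a hypothetical name, and the table lists only the statuses mentioned in this issue.

```python
import ctypes

# NTSTATUS values named in this issue.
NTSTATUS_NAMES = {
    0x00000000: "STATUS_SUCCESS",
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xC0000023: "STATUS_BUFFER_TOO_SMALL",
}

def describe_ntstatus(ret):
    """Normalize a (possibly signed) 32-bit return value and name it."""
    status = ctypes.c_uint32(ret).value      # unsigned view, e.g. 0xC0000023
    failed = ctypes.c_int32(ret).value < 0   # NT_SUCCESS means signed value >= 0
    name = NTSTATUS_NAMES.get(status, "unknown")
    return status, failed, name

# The signed value ctypes would return for 0xC0000023.
status, failed, name = describe_ntstatus(-1073741789)
print(f"{status:#010x} failed={failed} {name}")
```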

Root cause analysis

Static disassembly of npu_kmd.sys shows that Intel's VpuEscape dispatcher never returns STATUS_BUFFER_TOO_SMALL; the value appears there only as a comparison target. The rejection therefore originates in Microsoft's host-side dxgkrnl proxy, before the call reaches the vendor KMD.

The Linux-side dxgvmb_send_escape short-circuits only for payloads exceeding DXG_MAX_VM_BUS_PACKET_SIZE, and empty payloads succeed, so the VM bus channel itself is functional. The block therefore sits on the Windows host side and applies unconditionally to non-empty WSL2-originated escapes to compute partition adapters.
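
The guest-side reasoning can be modelled in a few lines. This is only a sketch under stated assumptions: the 128 KiB limit and the 64-byte command header are placeholder values chosen for illustration and should be checked against the dxgkrnl source, and `rejected_in_guest` is a hypothetical name, not a kernel function.

```python
# Assumed values, for illustration only; verify against the dxgkrnl source.
DXG_MAX_VM_BUS_PACKET_SIZE = 128 * 1024   # assumption
ESCAPE_HEADER_SIZE = 64                   # hypothetical command header size

def rejected_in_guest(payload_size):
    """True if the guest would refuse to send the escape over the VM bus."""
    return ESCAPE_HEADER_SIZE + payload_size > DXG_MAX_VM_BUS_PACKET_SIZE

# A sample of payload sizes from the tested 1-65536 byte range.
tested_sizes = [1, 64, 568, 4096, 65536]
locally_rejected = [s for s in tested_sizes if rejected_in_guest(s)]
print(locally_rejected)
```

Under these assumptions none of the tested sizes would be stopped by the guest-side check, which is consistent with the failure being produced on the host.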


Impact

Intel's NPU UMD issues D3DKMTEscape for process registration before the first CreateContextVirtual (escapes #1–4 in the call sequence, confirmed via Frida trace). Unless these succeed, no per-process state exists in npu_kmd.sys, so every subsequent primitive fails with INVALID_PARAMETER.

Result: OpenVINO inside WSL2 reports available_devices = ['CPU'] even though the NPU is enumerable at the D3DKMT layer.
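
The device list can be checked with a few lines of Python. This is a minimal sketch that assumes the `openvino` Python package is installed in the distro; `npu_visible` and `query_devices` are illustrative helper names, not OpenVINO APIs.

```python
def npu_visible(devices):
    """True if any device name is 'NPU' or an indexed variant such as 'NPU.0'."""
    return any(d == "NPU" or d.startswith("NPU.") for d in devices)

def query_devices():
    """Return OpenVINO's device list, or None if OpenVINO is not installed."""
    try:
        import openvino as ov
    except ImportError:
        return None
    return list(ov.Core().available_devices)

if __name__ == "__main__":
    devices = query_devices()
    if devices is None:
        print("openvino is not installed")
    else:
        print("available_devices =", devices)
        print("NPU visible:", npu_visible(devices))
```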


Ask

  1. A documented allowlist of permitted DRIVERPRIVATE escape patterns for compute-only partition adapters, OR
  2. A .wslconfig opt-in for vendor escape passthrough on compute-only adapters, OR
  3. Exposure of the NPU via /dev/accel/accel0 so the in-tree Linux IVPU driver (CONFIG_DRM_ACCEL_IVPU) can claim it directly, which would also resolve intel/linux-npu-driver#56.
