Speed up uuid.UUID string parsing #150226

@esadomer

uuid.UUID.__init__() normalizes every string input by removing urn:, uuid:, braces, and hyphens, even when a documented input needs only some of those steps, or none at all.

The common documented forms are:

  • 12345678123456781234567812345678
  • 12345678-1234-5678-1234-567812345678
  • {12345678-1234-5678-1234-567812345678}
  • urn:uuid:12345678-1234-5678-1234-567812345678

A small refactor can handle the documented urn:uuid: prefix directly with str.removeprefix() and keep the existing broader normalization as a fallback for legacy accepted inputs.
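As a rough illustration of the idea, and not the actual patch, the sketch below separates the documented urn:uuid: prefix from the legacy fallback. The function names legacy_normalize and normalize_uuid_hex are hypothetical, and using a ":" check to decide when the fallback is needed is an assumption made here, not something taken from Lib/uuid.py. It requires Python 3.9+ for str.removeprefix().

```python
import uuid


def legacy_normalize(text):
    # Broad normalization as described in this issue: remove "urn:" and
    # "uuid:" wherever they occur, then strip braces and hyphens.
    text = text.replace("urn:", "").replace("uuid:", "")
    return text.strip("{}").replace("-", "")


def normalize_uuid_hex(text):
    # Fast path: drop the documented "urn:uuid:" prefix in one step.
    hex_part = text.removeprefix("urn:uuid:")
    # Fallback (assumed gate): any remaining colon means the input may use
    # a legacy spelling, so apply the broader replacements.
    if ":" in hex_part:
        hex_part = hex_part.replace("urn:", "").replace("uuid:", "")
    return hex_part.strip("{}").replace("-", "")


DOCUMENTED_FORMS = [
    "12345678123456781234567812345678",
    "12345678-1234-5678-1234-567812345678",
    "{12345678-1234-5678-1234-567812345678}",
    "urn:uuid:12345678-1234-5678-1234-567812345678",
]

# Cross-check the sketch against the legacy path and against uuid.UUID.
for form in DOCUMENTED_FORMS:
    assert normalize_uuid_hex(form) == legacy_normalize(form) == uuid.UUID(form).hex
```

The sketch only normalizes; length and hex-digit validation would still happen afterwards, as it does in uuid.UUID today.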

On a local Windows debug build, a same-process benchmark comparing origin/main with the patched Lib/uuid.py showed improvements for the documented forms:

plain_hex:  11584.5 ns -> 11366.8 ns (1.9%)
canonical:  12424.6 ns -> 12253.1 ns (1.4%)
braced:     13293.5 ns -> 12981.1 ns (2.3%)
urn:        13800.0 ns -> 12912.8 ns (6.4%)

I also verified test_uuid passes on the same debug build.
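For anyone who wants to collect comparable timings, a minimal timeit harness might look like the following. This is a sketch, not the script used for the numbers above: it times only whichever uuid module is importable, so comparing origin/main with the patch would mean running it once per build (or per Lib/uuid.py). The names FORMS, time_parse, and run are hypothetical.

```python
import timeit
import uuid

# The four documented string forms listed in this issue.
FORMS = {
    "plain_hex": "12345678123456781234567812345678",
    "canonical": "12345678-1234-5678-1234-567812345678",
    "braced": "{12345678-1234-5678-1234-567812345678}",
    "urn": "urn:uuid:12345678-1234-5678-1234-567812345678",
}


def time_parse(text, number=20_000, repeat=5):
    # Best-of-`repeat` mean time per uuid.UUID(text) call, in nanoseconds.
    timer = timeit.Timer(lambda: uuid.UUID(text))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


def run(number=20_000, repeat=5):
    # Map each form name to its measured per-call time.
    return {name: time_parse(text, number, repeat) for name, text in FORMS.items()}


if __name__ == "__main__":
    for name, ns in run().items():
        print(f"{name + ':':<11} {ns:10.1f} ns")
```

Taking the minimum over several repeats reduces noise from other processes; absolute numbers on a debug build will differ from a release build.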

Labels: performance, stdlib, type-feature
