chore: [wip] PQC POC 2#13203
Conversation
TAG=agy CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
There was a problem hiding this comment.
Code Review
This pull request introduces Post-Quantum Cryptography (PQC) support across the Java SDK by integrating Bouncy Castle and configuring PQC-hardened SSL contexts for gRPC and HTTP/JSON transports. Key changes include the addition of a pqc-test module, reflection-based configuration for Netty channel builders, and updates to OAuth2Utils and InstantiatingHttpJsonChannelProvider. Feedback identifies issues with potential initialization errors in OAuth2Utils, the use of a SNAPSHOT dependency version, and performance overhead from repeated reflection lookups. Suggestions were also provided for improving exception handling, ensuring logic consistency in transport creation, and cleaning up test provider registrations.
…ranch chore/pqc-poc-2 TAG=agy CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
TAG=agy CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
This commit addresses the review feedback by implementing: 1. Graceful fallback to default NetHttpTransport with WARNING logging in OAuth2Utils static initializer. 2. warning-level exception logging in InstantiatingGrpcChannelProvider. 3. Caching of shaded/unshaded gRPC Netty OpenSSL reflection lookups inside thread-safe static OpenSslReflectionHolder to remove runtime overhead. 4. Reverting createHttpTransport in InstantiatingHttpJsonChannelProvider to return null when mTLS is not active, as default transport is already PQC-hardened. 5. Clean unregistration of BC and BCJSSE security providers in integration test server/client teardown. TAG=agy
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements Post-Quantum Cryptography (PQC) support by hardening HTTP and gRPC transports with Bouncy Castle JSSE. Key changes include updating OAuth2Utils to use a PQC-hardened transport, adding Bouncy Castle dependencies to GAX modules, and introducing reflection-based configuration for gRPC channels to support hybrid PQC key exchange. A new integration test module, pqc-test, is also introduced. Feedback highlights the use of a -SNAPSHOT dependency version in the parent POM, code duplication and brittle string-based class checks in the gRPC reflection logic, and unused imports.
| <javax.annotation-api.version>1.3.2</javax.annotation-api.version> | ||
| <grpc.version>1.81.0</grpc.version> | ||
| <google.http-client.version>2.1.0</google.http-client.version> | ||
| <google.http-client.version>2.1.1-SNAPSHOT</google.http-client.version> |
There was a problem hiding this comment.
Updating google.http-client.version to a -SNAPSHOT version in a parent POM is generally discouraged for release-track projects. This can lead to unstable builds and dependency resolution issues in environments without access to the specific snapshot repository. If this is necessary for the POC, please ensure it is reverted or replaced with a stable version before merging.
| private ManagedChannelBuilder<?> applyPqcConfiguration(ManagedChannelBuilder<?> builder) { | ||
| // Configure the PQ and classical hybrid named groups: | ||
| // 1. X25519MLKEM768 (codepoint 4588): Hybrid classical (X25519) + post-quantum (ML-KEM-768) key exchange. | ||
| // Provides defense-in-depth: if ML-KEM is compromised, security reverts to classical strength of X25519. | ||
| // 2. MLKEM768 (codepoint 1896): Pure post-quantum key exchange using ML-KEM-768. | ||
| // 3. X25519 (codepoint 29): Classical elliptic curve Diffie-Hellman key exchange, used as a fallback. | ||
| String[] hybridGroups = new String[] {"X25519MLKEM768", "MLKEM768", "X25519"}; | ||
| String builderClassName = builder.getClass().getName(); | ||
| boolean isShaded = "io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder".equals(builderClassName); | ||
| boolean isUnshaded = "io.grpc.netty.NettyChannelBuilder".equals(builderClassName); | ||
|
|
||
| if (isShaded && OpenSslReflectionHolder.SHADED_AVAILABLE) { | ||
| try { | ||
| Object sslContextBuilder = OpenSslReflectionHolder.SHADED_FOR_CLIENT.invoke(null); | ||
| OpenSslReflectionHolder.SHADED_OPTION_METHOD.invoke( | ||
| sslContextBuilder, OpenSslReflectionHolder.SHADED_GROUPS_OPTION, (Object) hybridGroups); | ||
| Object sslContext = OpenSslReflectionHolder.SHADED_BUILD_METHOD.invoke(sslContextBuilder); | ||
| OpenSslReflectionHolder.SHADED_SSL_CONTEXT_METHOD.invoke(builder, sslContext); | ||
| return builder; | ||
| } catch (java.lang.reflect.InvocationTargetException | IllegalAccessException | RuntimeException e) { | ||
| LOG.log(Level.WARNING, "Failed to configure shaded PQC transport fallback", e); | ||
| } | ||
| } else if (isUnshaded && OpenSslReflectionHolder.UNSHADED_AVAILABLE) { | ||
| try { | ||
| Object sslContextBuilder = OpenSslReflectionHolder.UNSHADED_FOR_CLIENT.invoke(null); | ||
| OpenSslReflectionHolder.UNSHADED_OPTION_METHOD.invoke( | ||
| sslContextBuilder, OpenSslReflectionHolder.UNSHADED_GROUPS_OPTION, (Object) hybridGroups); | ||
| Object sslContext = OpenSslReflectionHolder.UNSHADED_BUILD_METHOD.invoke(sslContextBuilder); | ||
| OpenSslReflectionHolder.UNSHADED_SSL_CONTEXT_METHOD.invoke(builder, sslContext); | ||
| return builder; | ||
| } catch (java.lang.reflect.InvocationTargetException | IllegalAccessException | RuntimeException e) { | ||
| LOG.log(Level.WARNING, "Failed to configure unshaded PQC transport fallback", e); | ||
| } | ||
| } | ||
| return builder; | ||
| } |
There was a problem hiding this comment.
The applyPqcConfiguration method contains significant code duplication between the shaded and unshaded Netty configuration paths. Additionally, checking the builder class name via string comparison (lines 938-939) is brittle and may fail if the class is renamed or wrapped.
Consider refactoring OpenSslReflectionHolder to store two instances of a helper class (e.g., NettyReflectionMetadata) that encapsulates the reflected methods and classes for each variant. You can then use Class.isInstance() for a more robust type check and a single helper method to apply the configuration.
References
- If code is duplicated and needs to be shared, move it to a separate helper/utility class.
| import javax.net.ssl.SSLContext; | ||
| import java.security.NoSuchAlgorithmException; |
| ks.load(PqcTestServer.class.getResourceAsStream("/pqctest.p12"), "password".toCharArray()); | ||
|
|
||
| // Build a custom HttpTransport explicitly trusting the self-signed certificate keystore. | ||
| final com.google.api.client.http.HttpTransport httpTransport = new com.google.api.client.http.javanet.NetHttpTransport.Builder() |
There was a problem hiding this comment.
this should be enabled by default via pom.xml in this test
…t delegating wrapper
…uncy Castle JSSE in PQC zero-config connectivity integration tests and add extensive documentation comments - Configures Bouncy Castle server-scoped system properties fallback to enforce ML-KEM-768 on Java 17. - Keeps compile-safe Java 20 reflection for JRE 20+ runtimes. - Adds extremely detailed comments describing provider, keystore, managers, server configurators, and netty gRPC secure socket pipelines. TAG=agy CONV=5d96c302-27fd-438a-ad0e-ffd6d64e61cb
This reverts commit f0e9e46.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new integration test suite for Post-Quantum Cryptography (PQC) auto-upgrades, featuring a common test module, a specialized PqcTestServer that enforces ML-KEM-768, and specific runners for snapshot and release versions. Key feedback includes addressing the missing registration of Bouncy Castle providers in the setup and ensuring global system properties are cleared in the teardown to prevent side effects. Other recommendations focus on removing duplicate code and documentation, increasing server startup timeouts for CI reliability, and cleaning up redundant methods or stale comments.
| } catch (ClassNotFoundException e) { | ||
| isPqcSupported = false; | ||
| } | ||
|
|
There was a problem hiding this comment.
The Bouncy Castle providers are not being registered in the setup method, despite the Javadoc and the teardown method suggesting they should be. Without registering BouncyCastleJsseProvider at position 1, the client will not use PQC for the TLS handshake as intended by this test suite.
if (isPqcSupported) {
Security.insertProviderAt(new org.bouncycastle.jsse.provider.BouncyCastleJsseProvider(), 1);
Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 2);
}| public static void teardown() { | ||
| if (serverProcess != null) { | ||
| // Forcibly destroy the background process and close standard streams to allow | ||
| // clean exit | ||
| serverProcess.destroyForcibly(); | ||
| } | ||
| if (isPqcSupported) { | ||
| Security.removeProvider("BCJSSE"); | ||
| Security.removeProvider("BC"); | ||
| } | ||
| } |
There was a problem hiding this comment.
The system properties set in setup (javax.net.ssl.trustStore, etc.) are not cleared in teardown. This can lead to side effects in other tests running in the same JVM. Ensure all global state is restored and that the cleanup is exception-safe to prevent resource leaks.
public static void teardown() {
if (serverProcess != null) {
// Forcibly destroy the background process and close standard streams to allow
// clean exit
serverProcess.destroyForcibly();
}
if (isPqcSupported) {
Security.removeProvider("BCJSSE");
Security.removeProvider("BC");
}
System.clearProperty("javax.net.ssl.trustStore");
System.clearProperty("javax.net.ssl.trustStorePassword");
System.clearProperty("javax.net.ssl.trustStoreType");
}References
- When managing a collection of closeable resources, ensure they are closed in the reverse order of their creation (LIFO). The implementation must be exception-safe to prevent resource leaks.
| /** | ||
| * Configures the integration test harness environment before test cases are executed. | ||
| * | ||
| * <p><b>Detailed Security & Keystore Configuration Architecture:</b> | ||
| * | ||
| * <ul> | ||
| * <li><b>What is a Keystore (PKCS12):</b> A PKCS12 keystore (<code>pqctest.p12</code>) is a | ||
| * secure key database containing the server's private key and its self-signed public | ||
| * certificate. The server uses it during the TLS handshake to prove its identity and | ||
| * establish a secure channel. | ||
| * <li><b>How Encryption Works:</b> The certificate itself does not encrypt message data | ||
| * directly. Instead, during the TLS handshake, the client and server negotiate a symmetric | ||
| * session key using post-quantum cryptography (ML-KEM). This session key is then used to | ||
| * encrypt and decrypt all sent/received HTTP/gRPC data. | ||
| * <li><b>Why a Custom Temporary Truststore is Required:</b> Because the server uses a | ||
| * self-signed test certificate, it is not signed by any public Certificate Authority (CA) | ||
| * trusted by the standard JRE truststore (<code>cacerts</code>). Without registering a | ||
| * custom truststore containing this certificate, standard JRE TLS clients will reject the | ||
| * connection with an <code>SSLHandshakeException</code>. We extract the certificate to a | ||
| * temporary file and point <code>javax.net.ssl.trustStore</code> to it, thereby trusting it | ||
| * scope-specifically for this test run without polluting or mutating the user's system-wide | ||
| * JRE truststore. | ||
| * <li><b>JCA Provider Registration:</b> Registers <code>BouncyCastleJsseProvider</code> at | ||
| * provider position 1. This registers Bouncy Castle as the primary security provider, | ||
| * causing all standard default <code>SSLContext</code> and vanilla client factories to | ||
| * utilize Bouncy Castle JSSE and negotiate PQC automatically. | ||
| * </ul> | ||
| */ |
|
|
||
| // Ephemeral port detection timeout (10 seconds) to fail-fast on server startup | ||
| // errors | ||
| if (System.currentTimeMillis() - startTime > 10000) { |
There was a problem hiding this comment.
A 10-second timeout for the server process to start and print its ports might be too short for some CI environments. Consider increasing this to at least 30 seconds to avoid flaky tests.
| if (System.currentTimeMillis() - startTime > 10000) { | |
| if (System.currentTimeMillis() - startTime > 30000) { |
| public void runTests() throws Exception { | ||
| assertEquals(isPqcSupported, clientSupportsPqc()); | ||
| testHttpPqc(); | ||
| testGrpcPqc(); | ||
| testBigQueryPqc(); | ||
| } |
| // Enforce ALWAYS and ONLY hybrid ML-KEM / Kyber named groups programmatically on | ||
| // HttpsServer! |
There was a problem hiding this comment.
This comment refers to a stale implementation detail regarding PQC enforcement that is actually handled in PqcEnforcingSSLEngine. Per repository rules, comments referring to stale implementations should be removed rather than updated.
References
- Remove comments that refer to stale implementations instead of updating them.
| private static class ByteMarshaller implements io.grpc.MethodDescriptor.Marshaller<byte[]> { | ||
| @Override | ||
| public InputStream stream(byte[] value) { | ||
| return new java.io.ByteArrayInputStream(value); | ||
| } | ||
|
|
||
| @Override | ||
| public byte[] parse(InputStream stream) { | ||
| try { | ||
| return com.google.common.io.ByteStreams.toByteArray(stream); | ||
| } catch (java.io.IOException e) { | ||
| throw new RuntimeException(e); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The ByteMarshaller class is duplicated from PqcConnectivityTest. Consider moving it to a shared utility class within the pqc-test-common module to avoid duplication and maintain proper layering.
References
- Avoid inverting layering by making lower-level channel implementations depend on higher-level RPC wrappers. If code is duplicated and needs to be shared, move it to a separate helper/utility class.
| try { | ||
| while (System.in.read() != -1) { | ||
| Thread.sleep(1000); | ||
| } |
…ate in pqc-tests workflow
No description provided.