C++: Model secure versions of scanf as flow sources #21856

Open
MathiasVP wants to merge 5 commits into github:main from MathiasVP:scanf-safe-functions

Contributor

@MathiasVP MathiasVP commented May 15, 2026

Does what it says on the tin.

A couple of small details need fixing before these can be modelled as flow sources: since every string-like output buffer is followed by a buffer-size argument, we cannot blindly mark every variadic argument as an output buffer.
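To make the argument layout concrete, here is a minimal C++ sketch of the classification problem. It is not the CodeQL implementation: `ArgRole` and `classifyVarargs` are hypothetical names, and the sketch assumes the usual scanf_s convention that `%c`, `%s` and `%[` conversions (and their wide `%C`/`%S` forms) each take a size argument right after the buffer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Role of one variadic argument in a scanf-style call.
enum class ArgRole { OutputBuffer, BufferSize };

// Hypothetical helper: walks a narrow format string and returns the role of
// each variadic argument, in order. When `secure` is true, every string-like
// conversion is assumed to be followed by a size argument, as in scanf_s.
std::vector<ArgRole> classifyVarargs(const std::string& fmt, bool secure) {
    std::vector<ArgRole> roles;
    const std::string lengthModifiers = "hlLjzt";
    std::size_t i = 0;
    while (i < fmt.size()) {
        if (fmt[i] != '%') { ++i; continue; }
        ++i;  // step past '%'
        if (i < fmt.size() && fmt[i] == '%') { ++i; continue; }  // literal "%%"

        bool suppressed = false;  // "%*d" reads input but stores nothing
        if (i < fmt.size() && fmt[i] == '*') { suppressed = true; ++i; }

        // Skip the optional field width and length modifiers.
        while (i < fmt.size() && std::isdigit(static_cast<unsigned char>(fmt[i]))) ++i;
        while (i < fmt.size() && lengthModifiers.find(fmt[i]) != std::string::npos) ++i;
        if (i >= fmt.size()) break;

        const char conv = fmt[i];
        if (conv == '[') {
            ++i;
            if (i < fmt.size() && fmt[i] == '^') ++i;
            if (i < fmt.size() && fmt[i] == ']') ++i;  // a leading ']' is literal
            while (i < fmt.size() && fmt[i] != ']') ++i;
        }
        ++i;  // step past the conversion character (or the closing ']')

        if (suppressed) continue;
        roles.push_back(ArgRole::OutputBuffer);
        const bool stringLike =
            conv == 'c' || conv == 's' || conv == '[' || conv == 'C' || conv == 'S';
        if (secure && stringLike) roles.push_back(ArgRole::BufferSize);
    }
    return roles;
}
```

Under these assumptions, `classifyVarargs("%d %s", true)` yields output, output, size, while the same format with `secure == false` yields two outputs. Only the first two arguments of the secure call should be treated as flow sources.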

To fix this, I've added new versions of hasLocalFlowSource and hasRemoteFlowSource with an additional Call column. Using these new predicates, the scanf_s variants can specify which output arguments of a given call are sources of flow instead of marking all of them.

Commit-by-commit review recommended.

@MathiasVP MathiasVP force-pushed the scanf-safe-functions branch from 4d76e8d to 3ea3420 Compare May 18, 2026 12:25
@MathiasVP MathiasVP force-pushed the scanf-safe-functions branch from c977469 to 19781e5 Compare May 18, 2026 13:21
@MathiasVP MathiasVP marked this pull request as ready for review May 18, 2026 18:48
@MathiasVP MathiasVP requested a review from a team as a code owner May 18, 2026 18:48
Copilot AI review requested due to automatic review settings May 18, 2026 18:48
Contributor

Copilot AI left a comment


Pull request overview

Adds modelling of the secure scanf_s/fscanf_s/sscanf_s/... variants as local/remote flow sources. Because every string-buffer vararg in these functions is followed by a buffer-size argument, the existing "all varargs are sources" approach would incorrectly mark size integers as sources. To address this, the PR extends RemoteFlowSourceFunction::hasRemoteFlowSource and LocalFlowSourceFunction::hasLocalFlowSource with an additional Call column so a model can express that the set of source arguments depends on the specific call, and updates ScanfFunctionCall.getOutputArgument to skip size arguments for S variants.

Changes:

  • Recognise the scanf_s/fscanf_s/sscanf_s/swscanf_s/_*_s_l family in Scanf.qll and add an isSVariant predicate plus a size-argument detector so getOutputArgument skips buffer-size varargs.
  • Add new 3-argument hasLocalFlowSource(Call, FunctionOutput, string) / hasRemoteFlowSource(Call, FunctionOutput, string) predicates with default bodies delegating to/from the existing 2-argument versions, and migrate ScanfModel/FscanfModel plus consumers (CleartextTransmission.ql, ExternalAPIsSpecific.qll, FlowSources.qll) to the new API.
  • Extend scanf library tests and the source/sink dataflow tests, and add a change-note documenting the new models and API.
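The call-aware API described in the second bullet can be pictured with a rough C++ analogy. This is a sketch only: QL predicates are relations rather than virtual methods, every name below (`Call`, `FlowSourceModel`, `isSourceForAnyCall`, `isSourceForCall`) is hypothetical, and only one direction of the default delegation is shown, since mutual defaults would recurse in C++.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a call site: just enough data for the sketch.
struct Call {
    std::string callee;              // name of the called function
    std::vector<bool> varargIsSize;  // per variadic argument: is it a buffer size?
};

class FlowSourceModel {
public:
    virtual ~FlowSourceModel() = default;

    // Call-agnostic form: the answer is the same for every call.
    virtual bool isSourceForAnyCall(std::size_t /*index*/) const { return false; }

    // Call-aware form: by default it simply reuses the call-agnostic answer,
    // so existing models keep working without changes.
    virtual bool isSourceForCall(const Call& call, std::size_t index) const {
        return index < call.varargIsSize.size() && isSourceForAnyCall(index);
    }
};

// Plain scanf: every variadic argument is an output, so the old form suffices.
class PlainScanfModel : public FlowSourceModel {
public:
    bool isSourceForAnyCall(std::size_t /*index*/) const override { return true; }
};

// scanf_s-style model: the answer depends on the call, so it overrides the
// call-aware form and skips arguments that are buffer sizes.
class SecureScanfModel : public FlowSourceModel {
public:
    bool isSourceForCall(const Call& call, std::size_t index) const override {
        return index < call.varargIsSize.size() && !call.varargIsSize[index];
    }
};
```

For a call described as `Call{"scanf_s", {false, false, true}}`, the secure model reports arguments 0 and 1 as sources and rejects argument 2, while the plain model never needs to inspect the call at all.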

| File | Description |
| --- | --- |
| cpp/ql/lib/semmle/code/cpp/commons/Scanf.qll | Adds S-variant detection, scanf_s/fscanf_s/sscanf_s/swscanf_s/_*_s_l recognition, and updates getOutputArgument to skip size arguments for S variants |
| cpp/ql/lib/semmle/code/cpp/models/implementations/Scanf.qll | Switches ScanfModel/FscanfModel to the new call-aware hasLocalFlowSource/hasRemoteFlowSource and factors logic into a shared hasFlowSource helper |
| cpp/ql/lib/semmle/code/cpp/models/interfaces/FlowSource.qll | Introduces 3-arg hasLocalFlowSource/hasRemoteFlowSource predicates with a Call column and default bodies bridging the old API |
| cpp/ql/lib/semmle/code/cpp/security/FlowSources.qll | Uses the new call-aware predicates in RemoteModelSource and LocalModelSource |
| cpp/ql/src/Security/CWE/CWE-311/CleartextTransmission.ql | Uses the new call-aware hasRemoteFlowSource overload |
| cpp/ql/src/Security/CWE/CWE-020/ExternalAPIsSpecific.qll | Simplifies the remote-source predicate using the new call-aware API |
| cpp/ql/test/library-tests/scanf/test.c | Adds a scanf_s call and a second integer variable to exercise S-variant handling |
| cpp/ql/test/library-tests/scanf/scanfFunctionCallOutput.ql / .expected | New test query and expected output verifying getOutputArgument results |
| cpp/ql/test/library-tests/scanf/scanfFunctionCall.expected / scanfFormatLiteral.expected | Updated baselines reflecting scanf_s modelling |
| cpp/ql/test/library-tests/dataflow/source-sink-tests/sources-and-sinks.cpp | Adds inline expectations for scanf_s/fscanf_s as local/remote sources |
| cpp/ql/lib/change-notes/2026-05-15-secure-scanf.md | Change-note describing the new models and the additional Call column on the flow-source predicates |

Copilot's findings

  • Files reviewed: 13/13 changed files
  • Comments generated: 2

Comment thread cpp/ql/lib/semmle/code/cpp/commons/Scanf.qll
Comment thread cpp/ql/lib/semmle/code/cpp/commons/Scanf.qll

2 participants