We have released a new API for people who write custom CodeQL queries that make use of dataflow analysis. The new API offers additional flexibility, prevents common pitfalls of the old API, and improves query evaluation performance by 5%. Whether you’re writing CodeQL queries for personal interest or participating in the bounty programme to help us secure the world’s code, this post will help you move from the old API to the new one.
This API change is relevant only for users who write their own custom CodeQL queries. Code scanning users who use GitHub’s standard CodeQL query suites will not need to make any changes.
With the introduction of the new dataflow API, the old API will be deprecated. The old API will continue to work until December 2024; the CodeQL CLI will start emitting deprecation warnings in December 2023.
To demonstrate how to update CodeQL queries from the old to the new API, consider this example query which uses the soon-to-be-deprecated API:
class SensitiveLoggerConfiguration extends TaintTracking::Configuration {
  SensitiveLoggerConfiguration() { this = "SensitiveLoggerConfiguration" } // 6: characteristic predicate with dummy string value (see below)

  override predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr }

  override predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  override predicate isSanitizer(DataFlow::Node sanitizer) {
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  override predicate isSanitizerIn(DataFlow::Node node) { this.isSource(node) }
}

import DataFlow::PathGraph

from SensitiveLoggerConfiguration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "This $@ is written to a log file.",
  source.getNode(),
  "potentially sensitive information"
To convert the query to the new API:

1. You use a module instead of a class. A CodeQL module does not extend anything; it instead implements a signature. For both data flow and taint tracking configurations this is DataFlow::ConfigSig, or DataFlow::StateConfigSig if FlowState is needed.
2. Previously, you chose between data flow and taint tracking by extending DataFlow::Configuration or TaintTracking::Configuration. Now you define your data or taint flow by instantiating either the DataFlow::Global<..> or TaintTracking::Global<..> parameterized module with your implementation of the shared signature, and this is where the choice between data flow and taint tracking is made.
3. Predicates no longer override anything, because you are defining a module.
4. Sanitizers and barriers are now unified under isBarrier, and it applies to both taint tracking and data flow configurations. You must use isBarrier instead of isSanitizer and isBarrierIn instead of isSanitizerIn.
5. Similarly, instead of isAdditionalTaintStep you use isAdditionalFlowStep.
6. The characteristic predicate with a dummy string value is no longer needed.
7. You no longer import the generic DataFlow::PathGraph. Instead, the PathGraph will be imported directly from the module you are using. For example, SensitiveLoggerFlow::PathGraph in the updated version of the example query below.
8. Likewise, you use the PathNode type from the resulting module and not from DataFlow.
9. The module is used directly in the from and where clauses. Instead of using e.g. cfg.hasFlowPath or cfg.hasFlow from a configuration object cfg, you’ll use flowPath or flow from the module you’re working with.

Taking all of the above changes into account, here’s what the updated query looks like:
module SensitiveLoggerConfig implements DataFlow::ConfigSig { // 1: module always implements DataFlow::ConfigSig or DataFlow::StateConfigSig
  predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr } // 3: no need to specify 'override'

  predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  predicate isBarrier(DataFlow::Node sanitizer) { // 4: 'isBarrier' replaces 'isSanitizer'
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  predicate isBarrierIn(DataFlow::Node node) { isSource(node) } // 4: isBarrierIn instead of isSanitizerIn
}

module SensitiveLoggerFlow = TaintTracking::Global<SensitiveLoggerConfig>; // 2: TaintTracking selected

import SensitiveLoggerFlow::PathGraph // 7: the PathGraph specific to the module you are using

from SensitiveLoggerFlow::PathNode source, SensitiveLoggerFlow::PathNode sink // 8 & 9: using the module directly
where SensitiveLoggerFlow::flowPath(source, sink) // 9: using the flowPath from the module
select sink.getNode(), source, sink, "This $@ is written to a log file.", source.getNode(),
  "potentially sensitive information"
While not covered in this example, you can also implement the DataFlow::StateConfigSig signature if flow-state is needed. You then instantiate DataFlow::GlobalWithState or TaintTracking::GlobalWithState with your implementation of that signature. Another change specific to flow-state is that instead of using DataFlow::FlowState, you now define a FlowState class as a member of the module. This is useful for using types other than string as the state (e.g. integers, booleans). An example of this implementation can be found here.
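To make the flow-state description above more concrete, here is a minimal sketch for Java, not taken from the CodeQL libraries and not checked against a specific CodeQL release. The module names (MarkedFlowConfig, MarkedFlow) and the method names "mark" and "consume" are hypothetical and chosen only for illustration. The sketch assumes the standard Java library imports shown and uses a boolean as the state type. Depending on the library version, the state-aware isBarrier may already have a default implementation; defining it explicitly should be harmless either way.

```ql
/**
 * @name Flow-state sketch (illustrative only)
 * @kind problem
 * @id java/example/flow-state-sketch
 */

import java
import semmle.code.java.dataflow.DataFlow
import semmle.code.java.dataflow.TaintTracking

module MarkedFlowConfig implements DataFlow::StateConfigSig {
  // The state type is a member of the module; here a boolean rather than a string.
  class FlowState = boolean;

  // Flow starts at any string literal, in the "not yet marked" state (false).
  predicate isSource(DataFlow::Node source, FlowState state) {
    source.asExpr() instanceof StringLiteral and
    state = false
  }

  // Only "marked" flow (true) that reaches an argument of a call to a
  // method named "consume" (hypothetical name) counts as reaching a sink.
  predicate isSink(DataFlow::Node sink, FlowState state) {
    exists(Call c |
      c.getCallee().hasName("consume") and
      sink.asExpr() = c.getAnArgument()
    ) and
    state = true
  }

  // No state-specific barriers in this sketch.
  predicate isBarrier(DataFlow::Node node, FlowState state) { none() }

  // Passing a value through a method named "mark" (hypothetical name)
  // moves the flow from the unmarked state to the marked state.
  predicate isAdditionalFlowStep(
    DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
  ) {
    exists(Call c |
      c.getCallee().hasName("mark") and
      node1.asExpr() = c.getAnArgument() and
      node2.asExpr() = c
    ) and
    state1 = false and
    state2 = true
  }
}

// Taint tracking is selected here; DataFlow::GlobalWithState would select plain data flow.
module MarkedFlow = TaintTracking::GlobalWithState<MarkedFlowConfig>;

from DataFlow::Node source, DataFlow::Node sink
where MarkedFlow::flow(source, sink)
select sink, "This argument receives a marked value from $@.", source, "a string literal"
```

Because this sketch reports plain results instead of paths, it uses flow and DataFlow::Node. A path query would instead import MarkedFlow::PathGraph and use MarkedFlow::PathNode with MarkedFlow::flowPath, as in the updated example query.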
This functionality is available with CodeQL version 2.13.0. If you would like to start writing your own custom CodeQL queries, follow these instructions to get started with the CodeQL CLI and the VS Code extension.