This spec covers how equality is compiled and executed by the F# compiler and library, based mainly on the types involved in the equality operation after all inlining, type specialization and other optimizations have been applied.
What do we mean by an equality operation?This spec is about the semantics and performance of the following coding constructs
a = b
a <> b
It is also about the semantics and performance of uses of the following FSharp.Core
constructs which, after inlining, generate code that contains an equality check at the specific EQTYPE
* HashIdentity.Structural<'T>
* {Array,Seq,List}.contains
* {Array,Seq,List}.countBy
* {Array,Seq,List}.groupBy
* {Array,Seq,List}.distinct
* {Array,Seq,List}.distinctBy
* {Array,Seq,List}.except
All of which have implied equality checks. Some of these operations are inlined, see below, which in turn affects the semantics and performance of the overall operation.
ER vs PER equalityIn math, a (binary) relation is a way to describe a relationship between the elements of sets. "Greater than" is a relation for numbers, "Subset of" is a relation for sets.
Here we talk about 3 particular relations: 1) Reflexivity - every element is related to itself - For integers, =
is reflexive (a = a
is always true) and >
is not (a > a
is never true) 2) Symmetry - if a
is related to b
, then b
is related to a
- For integers, =
is symmetric (a = b
-> b = a
) and >
is not (if a > b
then b > a
is false) 3) Transitivity - if a
is related to b
, and b
is related to c
, then a
is also related c
- For integers, >
is transitive (a > b
&& b > c
-> a > c
) and √
is not (a = √b
&& b = √c
doesn't mean a = √c
)
If a relation has 1, 2, and 3, we talk about Equivalence Relation (ER). If a relation only has 2 and 3, we talk about Partial Equivalence Relation (PER).
This matters in comparing floats since they include NaN. Depending on if we consider NaN = NaN
true or false, we talk about ER or PER comparison respectively.
The static type known to the F# compiler is crucial to determining the performance of the operation. The runtime type of the equality check is also significant in some situations.
Here we define the relevant static type EQTYPE
for the different constructs above:
a = b
: EQTYPE
is the statically known type of a
or b
a <> b
: EQTYPE
is the statically known type of a
or b
HashIdentity.Structural<'T>
, EQTYPE
is the inlined 'T
(results in specialized equality)Array.contains<'T>
, EQTYPE
is the inlined 'T
(results in specialized equality)List.contains<T>
likewiseSeq.contains<T>
likewiseThese only result in naked generic equality if themselves used from a non-inlined generic context.
Non-inlined constructs always resulting in naked generic equalityArray.groupBy<'Key, 'T> f array
, EQTYPE
is non-inlined 'Key
, results in naked generic equalityArray.countBy array
likewise for 'T
Array.distinct<'T> array
likewiseArray.distinctBy array
likewiseArray.except array
likewiseList.groupBy
likewiseList.countBy
likewiseList.distinct
likewiseList.distinctBy
likewiseList.except
likewiseSeq.groupBy
likewiseSeq.countBy
likewiseSeq.distinct
likewiseSeq.distinctBy
likewiseSeq.except
likewiseThese always result in naked generic equality checks.
Example 1:
let x = HashIdentity.Structural<byte> // EQTYPE known to compiler is `byte`
Example 2 (a non-inlined "naked" generic context):
let f2<'T> () =
... some long code
// EQTYPE known to the compiler is `'T`
// RUNTIME-EQTYPE known to the library is `byte`
let x = HashIdentity.Structural<'T>
... some long code
f2<byte>() // performance of this is determined by EQTYPE<'T> and RUNTIME-EQTYPE<byte>
Example 3 (an inlined generic context):
let f3<'T> () =
... some long code
// EQTYPE known to the compiler is `byte`
// RUNTIME-EQTYPE known to the library is `byte`
let x = HashIdentity.Structural<'T>
... some long code
f3<byte>() // performance of this is determined by EQTYPE<byte> and RUNTIME-EQTYPE<byte>
Example 4 (a generic struct type in a non-inline generic context):
let f4<'T> () =
... some long code
// EQTYPE known to the compiler is `SomeStructType<'T>`
// RUNTIME-EQTYPE known to the library is `SomeStructType<byte>`
let x = HashIdentity.Structural<SomeStructType<'T>>
... some long code
f4<byte>() // performance of this determined by EQTYPE<SomeStructType<'T>> and RUNTIME-EQTYPE<SomeStructType<byte>>
How we compile equality "a = b"
This very much depends on the EQTYPE
involved in the equality as known by the compiler
Aim here is to flesh these all out with: * Semantics: what semantics the user expects, and what the semantics actually is * Perf expectation: what perf the user expects * Compilation today: How we actually compile today * Perf today: What is the perf we achieve today * (Optional) sharplab.io link to how things are in whatever version is selected in sharplab * (Optional) notes
primitive integer types (int32
, int64
, ...)
let f (x: int) (y: int) = (x = y)
float32
, float64
)
let f (x: float32) (y: float32) = (x = y)
string
, decimal
String.Equals
or Decimal.op_Equality
call ✅GenericEqualityIntrinsic
IStructuralEqualityComparer
, boxes etc. ❌(Problem1)GenericEqualityIntrinsic
IStructuralEqualityComparer
etc. ❌(Problem2)IEquatable<T>
if present, but F# spec says call this.Equals(box that)
, in practice these are the sameGenericEqualityIntrinsic<SomeStructType>
FSharpEqualityComparer_PER`1<uint8[]>::get_EqualityComparer().Equals(...)
or FSharpEqualityComparer_PER`1<T[]>::get_EqualityComparer().Equals(...)
byte[]
Here "large" means the compiler-generated structural equality is NOT inlined.
Equals(T)
Equals(T)
has specialized code but boxes fields if struct or generic, see Problem3 ❌, Problem4 ❌Here "tiny" means the compiler-generated structural equality IS inlined.
FSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
'T
in non-inlined generic code
'T
actually isFSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
'T
in recursive position in structural comparison
This case happens in structural equality for tuple types and other structural types
'T
actually isFSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
EqualityComparer<'T>.Default
where possibleEqualityComparer.Default
- the change would lead to the usage of the custom implementation which is reasonableNote: this included changes to the optimizer to reduce GenericEqualityIntrinsic down to a type-indexed table lookup fetching an IEqualityComparer
and calling it. These hand-coded reductions appear unnecessary as the reduction doesn't open up any further optimizations.
val x: System.Collections.Generic.IEqualityComparer<byte>
module HashIdentity from Microsoft.FSharp.Collections
val Structural<'T (requires equality)> : System.Collections.Generic.IEqualityComparer<'T> (requires equality)
Multiple items
val byte: value: 'T -> byte (requires member op_Explicit)
--------------------
type byte = System.Byte
--------------------
type byte<'Measure> = byte
val f2<'T> : unit -> obj
'T
Multiple items
val int: value: 'T -> int (requires member op_Explicit)
--------------------
type int = int32
--------------------
type int<'Measure> = int
Multiple items
val float32: value: 'T -> float32 (requires member op_Explicit)
--------------------
type float32 = System.Single
--------------------
type float32<'Measure> = float32
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4