Showing content from https://github.com/dotnet/fsharp/issues/11976 below:

Modernizing F# Analysis · Issue #11976 · dotnet/fsharp · GitHub

@dsyme and I sat down to document my overall plan for modernizing the way F# and FCS do analysis (Don did most of the writing here). Much of this work is already done; this note documents the plan end-to-end. Here's what we came up with. Please comment and discuss below.

See also #7077 for a previous description.

Planning: Modernizing F# Analysis

This note describes the technical agenda to "modernize" the FCS analysis services to use best-known techniques from Roslyn.

Executive summary

The core of the plan is to adopt a more Roslyn-like model of analysis, based on

Immutable snapshots of the contents of documents and projects
Immutable views of their enrichment with analysis information
A cacheless compiler-service API

In the long term this agenda delivers multiple critical benefits:

High-performance multi-threaded analysis
A more reliable basis for implementing multiple IDE features, including cross-file refactorings and analysis
Allows features such as "in-memory documents" and "in-memory cross-project references to C#", simplifying the user experience of using F# in Visual Studio.
It aligns F# with the architectural principles of Roslyn, allowing contributors to transfer experience between the two
Looking forward, gives a strong basis for reliably making F# analysis more incremental w.r.t. incremental changes in inputs.
Looking forward, gives a strong basis for reliably building a simple, reliable "out-of-proc" LSP implementation for F# following Roslyn design principles.
Our compiler testing framework can be simplified; it will not read files on disk in order to run a test that verifies parsing and type-checking behavior.

The Current Situation and why it's a Problem

FCS provides services to compute analysis information from inputs. For the current API, some of these inputs are filenames, and so FCS relies in part on the state of the file system, which is highly mutable state and is highly problematic.

Specifically, when requesting the analysis of a file in a project (e.g. for a refactoring or a tooltip), the state of the current file is captured as a "snapshot", but the state of other files in the project is accessed via the file system as the analysis proceeds. This causes three problems:

A. The state of these files as saved on disk may have changed in-between

B. The saved and unsaved contents of in-memory buffers in the IDE may differ

C. It is extremely error-prone to implement incremental updates to analysis w.r.t. incremental changes in input

Additionally, FCS had two other major problems:

D. FCS is stateful and implements multiple kinds of caching for parsing and analysis

E. FCS was single-threaded, with a "reactor thread" compilation lock

Problem A leads to:

Repeated polling checks on timestamps of files whenever checking for validity of the results
A very frequent source of bugs (BUG LINKS)

Problem B leads to:

Confusion for the user who doesn't understand that prior files must be saved.
Double type-checking of open files in a project when a change is made: once as part of the "background" build that is done using on-disk representations, once as part of the "foreground" build in order to get diagnostics for currently open documents. This is not something users perceive, but is a potential overall performance gain we can deliver for large projects.
Unnecessary complexity and distinctions in the FCS API that can make it difficult to understand what's going on. For example, we must document and test the differences between foreground and background checking.
A slew of bugs in cross-file refactorings (LINK LINK), cross-file goto-definition, cross-file tooltips (based on saved, not unsaved contents)
Missed opportunities to take advantage of F# language features for more efficient incremental checking, in particular signature files.

Problems C & D lead to a slew of bugs related to not invalidating cache entries with regard to changes in on-disk files. Problem D also causes issues with memory usage and too many analysis results being "kept live" by the FCS caches.
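The stale-cache failure mode described above can be illustrated with a small, language-neutral sketch. This is not FCS code: the `disk` dictionary, `analyze`, `analyze_path` and `SourceSnapshot` are all hypothetical names invented for the illustration, and the "analysis" is just a token count.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the file system: path -> current contents.
disk = {"Lib.fs": "let x = 1"}


def analyze(text):
    """Toy 'analysis': count the whitespace-separated tokens."""
    return len(text.split())


# Stateful style: results are cached by file name, so correctness depends
# on somebody remembering to invalidate the entry when the file changes.
cache_by_path = {}


def analyze_path(path):
    if path not in cache_by_path:
        cache_by_path[path] = analyze(disk[path])
    return cache_by_path[path]


@dataclass(frozen=True)
class SourceSnapshot:
    """Immutable snapshot: the input to analysis carries its own contents."""
    path: str
    text: str


def analyze_snapshot(snapshot):
    # The result depends only on the snapshot, never on the disk state.
    return analyze(snapshot.text)


before = analyze_path("Lib.fs")
disk["Lib.fs"] = "let x = 1 + 2"   # the file changes on disk
stale = analyze_path("Lib.fs")     # cache was never invalidated
fresh = analyze_snapshot(SourceSnapshot("Lib.fs", disk["Lib.fs"]))
```

Here `stale` still reflects the old contents, while the snapshot-based call cannot go stale because there is no hidden state to invalidate.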

Problems A-C also apply to the "referenced assemblies" inputs to analysis, particularly cross-project references. Before the start of this work, specifying a cross-project reference was done via a graph of FSharpProjectOptions, but no in-memory cross-project references were allowed to C# projects. Further, the cross-project references led to reading input files from disk for other projects and assessing their timestamps, leading to bugs and inconsistencies.

In combination these issues lead to a kind of "gridlock" where the root causes of the kinds of bugs we see are not addressed using best-known techniques. We patch a few bugs, which can cause other bugs, and so on. We know the solution to unlock this, which is to follow the design principles used by Roslyn.

Aside: Problem B can be partly addressed by an existing "hack" in the FCS API that allows the file system used by FCS to be "shimmed". This is used by JetBrains Rider in order to implement in-memory documents. However this is an awkward solution that differs greatly from the Roslyn approach, and Problems A, C and D still remain.

What's Needed

The Roslyn approach to these problems is to

Make all inputs to analysis be "snapshot" objects
Make all analysis results on-demand stateless enrichments of these snapshots
Do not implement ad hoc caching of analysis objects within Roslyn, but rather allow liveness of analysis objects to determine lifetimes.

The technical agenda is based on transforming FCS to correspond to these principles.

Aside: when we say Roslyn analysis objects (e.g. Compilation LINK) are "on-demand stateless enrichments", this means there may be internal state recording what enrichments have already been computed, and this may be important for reasoning about memory usage. However, logically speaking, the analysis objects are still functional enrichments. Roslyn analysis objects are effectively like a composition of multiple lazy values, computed on-demand.
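A minimal sketch of the "composition of lazy values" idea, written in Python purely for illustration. `SourceText` here is a hypothetical toy type (not Roslyn's class of the same name), `Analysis` and its members are invented for the example, and `parse_count` exists only to make the laziness observable.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class SourceText:
    """Immutable snapshot of one document's contents (toy version)."""
    path: str
    text: str


class Analysis:
    """Logically a pure function of its snapshot.

    The cached values are internal state that only records which
    enrichments have already been computed.
    """

    def __init__(self, source):
        self._source = source
        self.parse_count = 0  # instrumentation for the example only

    @cached_property
    def tokens(self):
        # Computed the first time it is requested, then remembered.
        self.parse_count += 1
        return tuple(self._source.text.split())

    @cached_property
    def words(self):
        # A second enrichment, built on demand from the first.
        return frozenset(t for t in self.tokens if t.isidentifier())


snapshot = SourceText("A.fs", "let answer = 42")
analysis = Analysis(snapshot)
count_before = analysis.parse_count   # nothing has been computed yet
words = analysis.words                # forces tokens, then words
tokens_again = analysis.tokens        # reuses the remembered value
count_after = analysis.parse_count

# An edit never mutates the snapshot; it produces a new one, and a new
# Analysis object is created for it.
edited = SourceText("A.fs", "let answer = 43")
```

Dropping the last reference to `analysis` releases everything it computed, which is the sense in which liveness of the analysis object, rather than a separate cache, determines lifetime.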

Technical Agenda

The agenda is as follows. Where new constructs are brought into existence in the FCS API, we show their correspondence to Roslyn equivalents.

Add IFSharpSourceText (corresponds to SourceText in Roslyn). This allows for immutable views of snapshots of buffers

This included adopting IFSharpSourceText for foreground analysis.

OBSERVABLE GAIN: Among other things this prevented copying entire source files, a major cause of GC, and removed a slew of bugs and workarounds related to out-of-date snapshots.

Rewrite the "IncrementalBuild" engine to use a build graph

This allows for simple and reliable implementation of on-demand stateless enrichments within the background build.

There were also several major cleanup steps preparing for this. For example the build graph must be correctly incremental between the "diagnostics+tooltips" portion of analysis results and the more costly "symbol usages" portion, see Stop incremental builder from accumulating TcSymbolUses/TcResolutions/etc. #11666.

OBSERVABLE GAIN: This enabled "in-memory cross project referencing" for F# projects referencing C# projects. This gives a much simpler user experience, because changes in C# projects are now reflected immediately in an F# project without having to compile the C# project on-disk, and analysis results are available in uncompiled solutions.

Make the build graph free-threaded

OBSERVABLE GAIN: With this change, FCS started supporting concurrent requests for analysis results. This massively improves the performance of analysis and the responsiveness of the IDE.
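As a rough illustration of what serving concurrent requests from a shared graph involves, the sketch below shows a node whose value is computed at most once even when many threads ask for it at the same time. It is a Python toy under stated assumptions: `LazyNode` and the two example nodes are hypothetical names, and nothing here reflects how the FCS build graph is actually implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LazyNode:
    """A graph node computed on demand, at most once, from any thread."""

    def __init__(self, compute):
        self._compute = compute
        self._lock = threading.Lock()
        self._done = False
        self._value = None
        self.runs = 0  # instrumentation for the example only

    def get(self):
        if not self._done:              # fast path once the value exists
            with self._lock:
                if not self._done:      # re-check while holding the lock
                    self._value = self._compute()
                    self.runs += 1
                    self._done = True
        return self._value


# Two dependent nodes: "checking" pulls on "parsing" when first needed.
parsed = LazyNode(lambda: "parse tree of A.fs")
checked = LazyNode(lambda: "type info from (" + parsed.get() + ")")

# Many concurrent requests for the same result, with no global
# "one request at a time" thread in front of the graph.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: checked.get(), range(32)))
```

Every caller receives the same value, and each node's computation runs once; callers needing unrelated nodes would not block each other, since each node has its own lock.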
Make the build graph more incremental w.r.t. changes in implementation file without change in signature file

OBSERVABLE GAIN: This greatly improved performance of analysis in F# projects that contain signature files. The user sees diagnostics and other analysis results much more quickly when a change is made to an implementation file without change to the signature file.
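The reasoning behind this gain can be sketched as follows: if later files can only observe a file's signature, then the key for re-checking a later file need not include earlier implementations. The Python below is a simplified model under that assumption; `SignatureAwareChecker`, `digest` and the tuple layout are hypothetical, and the real build graph is considerably more involved.

```python
import hashlib


def digest(*parts):
    """Stable digest of several strings (separator avoids ambiguity)."""
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()


class SignatureAwareChecker:
    def __init__(self):
        self._cache = {}   # (file name, input digest) -> check result
        self.checked = []  # log of files that were actually re-checked

    def check_project(self, files):
        """files: ordered list of (name, signature_text, implementation_text)."""
        upstream = ""  # digest of everything that later files can observe
        results = []
        for name, sig, impl in files:
            # Checking this file depends on upstream signatures plus its
            # own signature and implementation.
            key = (name, digest(upstream, sig, impl))
            if key not in self._cache:
                self.checked.append(name)
                self._cache[key] = "checked " + name
            results.append(self._cache[key])
            # Later files see only the signature, not the implementation.
            upstream = digest(upstream, sig)
        return results


checker = SignatureAwareChecker()
a_sig, b_sig = "val f: int -> int", "val g: int -> int"
b_impl = "let g x = A.f x"

checker.check_project([("A", a_sig, "let f x = x + 1"), ("B", b_sig, b_impl)])
# Change only A's implementation: B is not re-checked.
checker.check_project([("A", a_sig, "let f x = x + 2"), ("B", b_sig, b_impl)])
log_after_impl_edit = list(checker.checked)
# Change A's signature: both A and B are re-checked.
checker.check_project([("A", "val f: int -> string", "let f x = string x"), ("B", b_sig, b_impl)])
log_after_sig_edit = list(checker.checked)
```

After the implementation-only edit the log gains a single entry for A, whereas the signature edit forces both files through again.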
Add FSharpReferencedProject, corresponding to Roslyn's CompilationReference

OBSERVABLE GAIN: This enabled "in-memory cross project referencing" for C# -> F# projects. This gives a much simpler user experience, because changes in C# projects are now reflected immediately in an F# project without having to compile the C# project on-disk, and analysis results are available in uncompiled solutions.

Add FSharpSource, corresponding to Roslyn's SyntaxTree. A prototype is in this PR. FSharpSource is now added but not yet used.

An FSharpSource is an input to analysis and internally is an "IFSharpSourceText + F# parse tree". The background and foreground analysis routines will accept these objects.

NOTE: These are not incremental w.r.t. incremental changes in the IFSharpSourceText (adding, removing lines) though in theory they could allow some incrementality in future refinements.

Make Visual Studio provide FSharpSource objects based on live buffers. A prototype is part of [WIP] In-Memory documents for FCS #11588.

OBSERVABLE GAIN: This reliably implements the "in-memory documents" feature

OBSERVABLE GAIN: This gives reliable rename-refactor for unsaved files.

OBSERVABLE GAIN: This avoids a slew of other bugs and complexity going forward.

Add FSharpProject, corresponding to Roslyn's Compilation. A prototype is in this PR.

An FSharpProject is a handle to the "outputs" of analysis. That is, an FSharpProject is a Roslyn-like on-demand analysis object which can be used to request analysis information, e.g. diagnostics, tooltips, symbol uses, across an entire project.

An FSharpProject is incremental w.r.t. replacing the FSharpSource inputs. That is, the granularity of incrementality is "replace the contents of an entire file". If, for example, the last FSharpSource in the compilation sequence is replaced, then the majority of the results of analysis will be re-used.

An FSharpProject is on-demand in multiple ways, all mediated by the internal build graph. For example, if diagnostics are requested for the 3rd file out of 5 in the compilation sequence, only 3 files will be checked. If the diagnostics are then requested for the 5th file, the remaining two files will be checked. Some semantic information may be re-computed each time it is requested. Some may require more detailed re-checking, e.g. to record symbol locations.
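The on-demand and whole-file incremental behaviour described above can be modelled with a short sketch. This is Python used for illustration only: `Source`, `Project`, `diagnostics` and `replace` are hypothetical names and do not describe the actual FSharpProject API; the "check" is a placeholder string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    text: str


class Project:
    """Toy on-demand project: files are checked in order, only as needed."""

    def __init__(self, sources, _reused=None):
        self.sources = tuple(sources)
        self._results = list(_reused or [])  # results for a checked prefix
        self.check_log = []                  # files checked by this object

    def diagnostics(self, index):
        # Check forward from the first unchecked file up to `index`.
        while len(self._results) <= index:
            src = self.sources[len(self._results)]
            self.check_log.append(src.name)
            self._results.append("ok: " + src.name)
        return self._results[index]

    def replace(self, index, new_source):
        # Return a new Project; results for earlier files are reused,
        # results for the replaced file and everything after are dropped.
        sources = self.sources[:index] + (new_source,) + self.sources[index + 1:]
        return Project(sources, _reused=self._results[:index])


names = ["File1.fs", "File2.fs", "File3.fs", "File4.fs", "File5.fs"]
project = Project(Source(n, "// contents of " + n) for n in names)

project.diagnostics(2)                 # 3rd of 5: checks files 1 to 3
log_after_third = list(project.check_log)
project.diagnostics(4)                 # 5th: checks only files 4 and 5
log_after_fifth = list(project.check_log)

# Replacing the last file reuses the results for the first four.
edited = project.replace(4, Source("File5.fs", "// edited contents"))
edited.diagnostics(4)
log_after_edit = list(edited.check_log)
```

Replacing an early file would discard more of the prefix, which is why whole-file granularity favours edits late in the compilation sequence.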

An FSharpProject can be created without needing an FSharpChecker. Thus they are not tied to the caches of FSharpChecker.

This addresses problem (D) above, in the sense that FCS clients like Visual Studio can choose their own lifetimes for FSharpProject objects (usually the lifetime of the associated Roslyn Workspace).

Make Visual Studio use FSharpProject objects instead of FSharpChecker. A prototype is part of [Experiment] FSharpProject snapshot #11775. A preparatory PR was [VS] Consolidate Roslyn workspace and FCS #11694.

OBSERVABLE GAIN: This reduces memory usage

OBSERVABLE GAIN: This reduces bugginess due to invalidation, timing and state problems

This is also preparatory to reliably making FCS out-of-process with LSP.

Stabilize, document, refine the public APIs as part of making FCS a binary compatible component in support of F# Analyzers

And that is all.

Looking ahead

Further Incrementality

One result of the above agenda is that it provides a basis to begin to implement finer-grained incremental adjustment of analysis results w.r.t. incremental changes in inputs. Currently (at the end of the above agenda) incrementality is at the granularity of replacing the contents of an entire file. We could now consider incrementality w.r.t. adding text at the end of a file, or changes within a line. This requires incremental parsing and checking. The aim here would be higher-performance IDE analysis.

Roslyn supports this kind of incrementality but it is not an essential part of the above agenda.

LSP and Out of Process

Future changes to Roslyn will require F# to implement LSP, at least the minimum needed to do diagnostic analysis out-of-process (see #11969; note this is a tiny part of LSP, and Ionide provides a full implementation).

An LSP implementation of F# will host FSharp.Compiler.Service and should ideally have an implementation architecture very similar to the C# out-of-proc LSP implementation. Completing this agenda allows us to use this approach. For example, the out-of-proc process will mirror the Roslyn workspace and hold handles to the appropriate FSharpProject objects, just as the C# version of the same holds a Roslyn Compilation object.

Crucially, this means the F# LSP implementation will be simple, reliable and relatively stateless (apart from holding FSharpProject objects).
