Showing content from https://github.com/rust-lang/rust/issues/58967 below:

Tracking Issue for `self-profile` minimum viable product · Issue #58967 · rust-lang/rust · GitHub

Goals for the first usable iteration of -Zself-profile are:

Make the compiler track query invocations and other important function calls (e.g. LLVM related)
- This means tracking the query/function name (no query keys, arguments yet)
Reduce the overhead of tracking and profile generation so that it is not prohibitive
- This means emitting events in an optimized binary format
Write a post-processing tool that generates an aggregated report from the raw event data
- The aggregated report is a table with one line per query/function and columns for
  - total time spent in the query (in milliseconds)
  - time spent in the query as percentage of total compile time
  - number of query invocations
  - percentage of in-memory cache hits
  - percentage of incremental cache hits
  - total time spent (milliseconds) in loading query results from incremental cache
  - total time spent (milliseconds) blocked on concurrent query invocations
Re-enable self-profiling on perf.rlo, which includes
- running the postprocessing tool to generate the report for each test run
- adding a new comparison view that compares the test runs of a single benchmark and shows changes per query. This view is reachable by clicking on a benchmark in the regular comparison view (i.e. one can "zoom" into a given benchmark)
Document how self-profiling works in the rustc-guide.
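The aggregation step described in the goals above can be sketched roughly as follows. This is only an illustration of the idea, not the actual post-processing tool: the `RawEvent` and `ReportLine` types and the `summarize` function are hypothetical names, only three of the listed columns plus a single cache-hit column are computed, and "total compile time" is approximated as the plain sum of all event durations, which ignores nesting between queries.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// Hypothetical raw event: one record per query or function invocation.
/// Only the name is tracked, matching the "no query keys yet" goal.
#[derive(Debug, Clone)]
struct RawEvent {
    label: &'static str,
    duration: Duration,
    cache_hit: bool,
}

/// One line of the aggregated report (hypothetical layout).
#[derive(Debug, Default, Clone, PartialEq)]
struct ReportLine {
    total_ms: f64,
    percent_of_total: f64,
    invocations: u64,
    cache_hits: u64,
    cache_hit_percent: f64,
}

/// Folds raw events into one report line per query/function name.
fn summarize(events: &[RawEvent]) -> BTreeMap<&'static str, ReportLine> {
    let mut report: BTreeMap<&'static str, ReportLine> = BTreeMap::new();
    let mut grand_total_ms = 0.0;

    // First pass: accumulate absolute numbers per label.
    for event in events {
        let ms = event.duration.as_secs_f64() * 1000.0;
        grand_total_ms += ms;

        let line = report.entry(event.label).or_default();
        line.total_ms += ms;
        line.invocations += 1;
        if event.cache_hit {
            line.cache_hits += 1;
        }
    }

    // Second pass: derive the percentage columns.
    for line in report.values_mut() {
        if grand_total_ms > 0.0 {
            line.percent_of_total = 100.0 * line.total_ms / grand_total_ms;
        }
        // `invocations` is at least 1 for every line that exists.
        line.cache_hit_percent = 100.0 * line.cache_hits as f64 / line.invocations as f64;
    }

    report
}
```

A `BTreeMap` is used here only so that the report lines come out in a stable, name-sorted order; a real tool would more likely sort by total time.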

Non-Goals are:

Supporting self-profiling in 32-bit compilers (leaving this out makes it easier to rely on things like memory-mapped files)
Tracking individual query keys/function arguments

Work packages resulting from this set of goals are:

Implement a library that takes care of reading and writing the binary event format
- See https://github.com/rust-lang/measureme/
Make the compiler use the library to emit profiling data efficiently
- Initial integration in "Use measureme in self profiler" (#59515)
- Implement "event filtering" in order to keep profiling overhead low in the common case ("Implement event filtering for self-profiler.", #59915)
- Add output directory argument to -Zself-profile ("Allow to specify profiling data output directory as -Zself-profile argument.", #61123)
- Add a version header to profiler artifacts ("Add versioning to the binary profile format", measureme#40)
Implement a postprocessing tool (using the library) that generates the aggregated report ("Implement a summarization tool for profile traces", measureme#17)
Make perf.rlo support self-profile:
- run benchmarks with -Zself-profile
- run postprocessing tool
- store aggregated reports
- implement the detailed comparison view
- make the regular comparison view link to detailed views
Review and make sure that we are tracking everything we are interested in. Things to check:
- Pre-query passes (parsing, macro expansion, name resolution, HIR lowering, ...)
- LLVM optimization passes
- Metadata loading/decoding
- Trait selection (removed this from the MVP for now)
Document how self-profiling works in the rustc-guide.
Polishing iteration
- Write high-level crate docs for measureme (implemented in "Write some crate level documentation", measureme#68)
- Detailed view should show "percentage of total time" column ("detailed-view: Percentage of total time column missing", rustc-perf#523)
- Show total sum line in table for the entire crate ("detailed-view: Add a "sum total" line at the top of the table.", rustc-perf#525)
- Make sorting more visible/accessible in the results table ("detailed views: Make it more visible that columns can be sorted.", rustc-perf#526)
- Clarify what exactly the "invocations" and "cache misses" columns in the detailed view mean ("detailed (comparison) view: The cache misses column should be removed, invocations should be renamed", rustc-perf#529)
- Resolve a bug in the sum for the incr. loading time column ("detailed view: Possible bug in "totals" line for incr. comp. cache loading", rustc-perf#527)
- Clean up self-time computation in summarize ("Event recording & summarize need cleanup with respect to self-time vs incr-load-time vs blocked-time vs total-time.", measureme#75)
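The "event filtering" work package listed above can be pictured as a bitmask check that runs before anything is recorded, so that disabled event kinds cost little more than one comparison. The sketch below only illustrates that idea under stated assumptions: `EventFilter`, its category constants, and the `Profiler` type are hypothetical names chosen for this example and are not taken from the compiler or from measureme.

```rust
/// Hypothetical set of event categories, one bit per category.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EventFilter(u32);

impl EventFilter {
    const NONE: EventFilter = EventFilter(0);
    const GENERIC_ACTIVITIES: EventFilter = EventFilter(1 << 0);
    const QUERY_PROVIDERS: EventFilter = EventFilter(1 << 1);
    const QUERY_CACHE_HITS: EventFilter = EventFilter(1 << 2);
    const QUERY_BLOCKED: EventFilter = EventFilter(1 << 3);
    const INCR_CACHE_LOADS: EventFilter = EventFilter(1 << 4);

    /// Assumed "common case" default: skip the very frequent cache-hit events.
    const DEFAULT: EventFilter = EventFilter(
        Self::GENERIC_ACTIVITIES.0
            | Self::QUERY_PROVIDERS.0
            | Self::QUERY_BLOCKED.0
            | Self::INCR_CACHE_LOADS.0,
    );

    /// Returns the union of two filters.
    const fn union(self, other: EventFilter) -> EventFilter {
        EventFilter(self.0 | other.0)
    }

    /// Returns true if every category in `other` is enabled in `self`.
    const fn contains(self, other: EventFilter) -> bool {
        self.0 & other.0 == other.0
    }
}

/// Hypothetical profiler that stores labels in memory instead of a file.
struct Profiler {
    filter: EventFilter,
    recorded: Vec<&'static str>,
}

impl Profiler {
    fn new(filter: EventFilter) -> Self {
        Profiler { filter, recorded: Vec::new() }
    }

    /// Records the event only if its category is enabled.
    fn record(&mut self, kind: EventFilter, label: &'static str) {
        if self.filter.contains(kind) {
            self.recorded.push(label);
        }
    }
}
```

With `Profiler::new(EventFilter::DEFAULT)`, a call such as `record(EventFilter::QUERY_CACHE_HITS, "type_of")` is dropped, while `record(EventFilter::QUERY_PROVIDERS, "typeck")` is kept. Opting in to more detail is a matter of passing `EventFilter::DEFAULT.union(EventFilter::QUERY_CACHE_HITS)` instead.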

Possible Problems that might arise are:

Profiling overhead remains too high. In that case we need to think about doing separate self-profile runs on perf.rlo.

cc @wesleywiser @Mark-Simulacrum
