On 7/30/20 11:11 AM, Renato Golin wrote:
> On Thu, 30 Jul 2020 at 16:58, Johannes Doerfert
> <johannesdoerfert at gmail.com> wrote:
>> I mean, you can put the command line string that set the options into
>> the first place, right? That is as long as it initially was, or maybe I
>> am missing something.
>
> Options change with time, and this would make the IR incompatible
> across releases without intentionally doing so.

You could arguably be forgiving when it comes to the parsing of these,
so you might lose some if you mix IR across releases, but right now you
cannot express this at all. I mean, IR looks as if it captures the
entire state, but not quite. As a use case, the question of how to
reproduce `clang -O3` with opt comes up every month or so on the list.

Let's table this for now as it seems unrelated to this proposal.

>> To recap things that might "differ" from the original proposal:
>>   - We want multiple target triples.
>>   - We probably want multiple data layouts.
>>   - We probably want multiple pass pipelines, with different (cmd
>>     line) options and such.
>>   - We might want to make modules self contained wrt. target options
>>     such that you can create TTI and friends w/o repeating driver
>>     options.
>
> The extent of the separation is what made me suggest that it might be
> easier, in the end, to carry multiple modules, from different
> front-ends, through multiple pipelines but interacting with each
> other.
>
> I guess this is why David made a parallel with LTO, as this ends up as
> being a multi-device LTO in a sense. I think that will be easier and
> much less intrusive than rewriting the global context, target flags,
> IR annotation, data layout assumptions, target triple parsing, target
> options bundling, etc.

It is definitely multi-device (link time) optimization. The link time
part is somewhat optional and might be misleading given the popularity
of single source programming models for accelerators.
The "thinLTO" idea would also not be sufficient for everything we hope
to do; the two-module approach would be, though.

What if we don't rewrite these things but still merge the modules?
Let me explain ;)

(I use `opt` invocations below as a placeholder for lack of a better
term, knowing it is not (only) the `opt` tool we are talking about.)

The problem is that the `opt` invocation is primed for a single target;
everything (= pipeline, TTI, flags, ...) exists only once, right?
I imagine the two-module approach running two `opt` invocations, one for
each module, which we would synchronize at some point to do cross-module
optimizations. Given that we can run two `opt` invocations and we assume
a pass can work with two modules, that is, two sets of everything, why
do we need the separation?

From a tooling perspective I think it makes things easier to have a
single module. That said, it should not preclude us from running two
separate `opt` invocations on it. So we don't rewrite everything but
instead "just" need to duplicate all the information in the IR such that
each `opt` invocation can extract its respective set of values and run
on the respective set of global symbols. This would reduce the new stuff
to more or less what we started with: device triple & DL, and a way to
link a global symbol to a device triple & DL.

It is the two-module approach but with "co-located" modules ;)

WDYT?

~ Johannes

P.S. This is really helpful but I won't give up so easily on the idea.
     If I do, I have to implement cross module optimizations and I would
     rather not ;)
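
To make the "co-located" idea a bit more concrete, here is a rough toy
model in Python. It is only a sketch under my own assumptions and has
nothing to do with the actual LLVM API: `Target`, `CoLocatedModule`,
`add_target`, `add_symbol` and `view` are made-up names, the target ids
"host" and "device" are arbitrary, and the data layout strings are
placeholders. It shows one container holding several triple & DL pairs,
a link from each global symbol to one of them, and the per-target view
that a single `opt`-like invocation would extract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Target:
    """One triple & data layout pair (hypothetical, not an LLVM class)."""
    triple: str
    data_layout: str


@dataclass
class CoLocatedModule:
    """Toy stand-in for a single module that carries several targets."""
    targets: Dict[str, Target] = field(default_factory=dict)
    # Maps each global symbol name to the id of the target it belongs to.
    symbol_target: Dict[str, str] = field(default_factory=dict)

    def add_target(self, tid: str, triple: str, data_layout: str) -> None:
        self.targets[tid] = Target(triple, data_layout)

    def add_symbol(self, name: str, tid: str) -> None:
        if tid not in self.targets:
            raise KeyError(f"unknown target id: {tid}")
        self.symbol_target[name] = tid

    def view(self, tid: str) -> Tuple[Target, List[str]]:
        """Return what one single-target invocation would work on:
        its triple & DL plus the sorted names of its global symbols."""
        symbols = sorted(n for n, t in self.symbol_target.items() if t == tid)
        return self.targets[tid], symbols


m = CoLocatedModule()
m.add_target("host", "x86_64-unknown-linux-gnu", "<host-DL>")      # DL is a placeholder
m.add_target("device", "nvptx64-nvidia-cuda", "<device-DL>")       # DL is a placeholder
m.add_symbol("main", "host")
m.add_symbol("kernel", "device")
m.add_symbol("helper", "device")

for tid in m.targets:
    target, symbols = m.view(tid)
    print(tid, target.triple, symbols)
```

Each call to `view` plays the role of one single-target invocation
picking its own set of values out of the shared module; a cross-target
pass would simply look at more than one view at a time.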