It'd be useful in your proposal to note which of the existing LLVM
target-specific intrinsics this generic intrinsic can effectively supersede
(e.g. llvm.x86.pclmulqdq for x86). When we already support a given function
via target-specific intrinsics for a number of different targets, that seems
a pretty good argument for making it available as a more generic,
target-independent intrinsic.

On Sun, Jul 5, 2020, 5:18 AM Shawn Landden via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Carry-less multiplication[1] instructions exist (at least optionally) on
> many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly
> more.
>
> This proposal is to add a llvm.clmul instruction. Or if that is
> contentious, llvm.experimental.bitmanip.clmul instruction. It takes two
> integer operands of the same width, and returns an integer with twice the
> width of the operands. (Is there a good reason to make these the same
> width, as all the other operations do even when it doesn't really make
> sense for the mathematical operation, like multiplication or
> ctpop/ctlz/cttz?)
>
> If the CPU does not have a dedicated clmul operation, it can be lowered
> to regular multiplication, by using holes to avoid carries.
>
> ==Where is clmul used?==
>
> While somewhat specialized, the RISC-V manual documents many uses: [2]
>
> The classic applications for clmul are Cyclic Redundancy Check (CRC)
> [11, 26] and Galois/Counter Mode (GCM), but more applications exist,
> including the following examples. There are obvious applications in
> hashing and pseudo random number generations. For example, it has been
> reported that hashes based on carry-less multiplications can outperform
> Google's CityHash [17].
>
> clmul of a number with itself inserts zeroes between each input bit. This
> can be useful for generating Morton code [23].
>
> clmul of a number with -1 calculates the prefix XOR operation. This can be
> useful for decoding gray codes. Another application of XOR prefix sums
> calculated with clmul is branchless tracking of quoted strings in
> high-performance parsers. [16]
>
> Carry-less multiply can also be used to implement Erasure code
> efficiently. [14]
>
> ==clmul lowering without hardware support==
>
> An 8x8=>16 clmul can also be lowered to a 32x32=>64 multiplication when
> there is no specialized instruction (also 15x15=>30, to a 60x60=>120, or if
> bitreverse is available 16x16=>32 to TWO 64x64=>64 multiplications)[3].
>
> [1] https://en.wikipedia.org/wiki/Carry-less_product
> [2] (page 30)
> https://raw.githubusercontent.com/riscv/riscv-bitmanip/master/bitmanip-0.92.pdf
> [3] https://www.bearssl.org/constanttime.html
>
> (First posted to discord
> --
> Shawn Landden
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
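
The quoted proposal's claims about lowering "by using holes" and about the squaring and prefix-XOR identities can be checked on the smallest case it mentions, 8x8=>16 via one 32x32=>64 multiply. What follows is a minimal sketch in C, not code from this thread, from LLVM, or from BearSSL. All function names (clmul8_ref, spread4, clmul8_mul, clmul8_mismatches) are hypothetical, and the hole width of three zero bits is an assumption chosen so that the at most eight partial products landing on one bit position cannot carry into the next.

```c
#include <stdint.h>

/* Reference carry-less multiply, 8x8 => 16: shift-and-XOR instead of
 * shift-and-add, so no carries ever propagate. */
uint16_t clmul8_ref(uint8_t a, uint8_t b)
{
    uint16_t r = 0;
    for (int i = 0; i < 8; i++) {
        if ((b >> i) & 1)
            r ^= (uint16_t)((uint16_t)a << i);
    }
    return r;
}

/* Hypothetical helper: move bit i of x to bit 4*i, leaving three zero
 * "hole" bits after every data bit. */
uint32_t spread4(uint8_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 8; i++)
        r |= (uint32_t)((x >> i) & 1) << (4 * i);
    return r;
}

/* 8x8 => 16 clmul using one ordinary 32x32 => 64 multiplication.
 * In the integer product, each 4-bit group holds the COUNT (0..8) of
 * partial products for that position. Its lowest bit is the parity,
 * which is exactly the carry-less result bit. */
uint16_t clmul8_mul(uint8_t a, uint8_t b)
{
    uint64_t p = (uint64_t)spread4(a) * spread4(b);
    uint16_t r = 0;
    for (int k = 0; k < 15; k++)
        r |= (uint16_t)(((p >> (4 * k)) & 1) << k);
    return r;
}

/* Exhaustively compare the two versions; returns the number of
 * input pairs on which they disagree. */
int clmul8_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (clmul8_ref((uint8_t)a, (uint8_t)b) !=
                clmul8_mul((uint8_t)a, (uint8_t)b))
                bad++;
    return bad;
}
```

Under these assumptions, clmul8_ref(0xFF, 0xFF) yields 0x5555 (every input bit followed by a zero, the Morton-code pattern), and the low byte of clmul8_ref(0x12, 0xFF) is 0x0E, the prefix XOR of 0x12. The loops in spread4 and clmul8_mul are written for clarity. A real lowering would presumably use masks and shifts, and the wider variants mentioned in the proposal need more than this single-multiply scheme.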