Showing content from https://github.com/isrc-cas/rvv-llvm/commit/fc14172539eb1747cb4f92e47d928f0fbc4902b2 below:

apply patch from branch rvv-2018-11-01 of rvv-llvm · plctlab/llvm-project@fc14172 · GitHub

41 files changed: +1585 -36 lines changed

Original file line number Diff line number Diff line change

@@ -54,3 +54,5 @@ autoconf/autom4te.cache

54 54

.vs

55 55

# clangd index

56 56

.clangd

57 +

# patch reject

58 +

*.rej

Original file line number Diff line number Diff line change

@@ -1,18 +1,12 @@

1 -

The LLVM Compiler Infrastructure

2 -

================================

3 - 4 -

This directory and its subdirectories contain source code for LLVM,

5 -

a toolkit for the construction of highly optimized compilers,

6 -

optimizers, and runtime environments.

7 - 8 -

LLVM is open source software. You may freely distribute it under the terms of

9 -

the license agreement found in LICENSE.txt.

10 - 11 -

Please see the documentation provided in docs/ for further

12 -

assistance with LLVM, and in particular docs/GettingStarted.rst for getting

13 -

started with LLVM and docs/README.txt for an overview of LLVM's

14 -

documentation setup.

15 - 16 -

If you are writing a package for LLVM, see docs/Packaging.rst for our

17 -

suggestions.

18 - 1 +

This repository contains a fork of LLVM (see https://llvm.org) with patches

2 +

towards supporting the RISC-V vector extension (see

3 +

https://github.com/riscv/riscv-v-spec/). This is very much a work in progress.

4 + 5 +

See `docs/RISCVVectorCodegen.md` for some documentation.

6 + 7 +

These patches are regularly rebased to track LLVM trunk, which means commit

8 +

hashes will change. To mitigate the annoyance that causes, development always

9 +

happens on a branch named `rvv-<date>` where `<date>` is the date of the last

10 +

rebase, and on the next rebase this branch is left alone and a new branch is

11 +

created. The downside of this is that you will have to check the list of

12 +

branches to find the latest version.

Original file line number Diff line number Diff line change

@@ -0,0 +1,69 @@

1 +

Code Generation for the RISC-V Vector Extension

2 +

===============================================

3 + 4 +

The vector processing capabilities added by the vector extension are very flexible, but this flexibility introduces some unique challenges in generating correct code for it, let alone optimized code. This document gives a bird's eye view of how vector values and operations are handled throughout the RISCV backend to solve these challenges.

5 + 6 +

## Context

7 + 8 +

The LLVM IR we receive as input explicitly operates on vectors of unknown length (e.g., `<scalable 1 x i32>`). The number of elements in an IR vector can be a compile-time integer multiple of the unknown `vscale` (e.g. `<scalable 4 x i32>`), but only a multiple of 1 is legal in the RISCV backend. The unknown factor of the vector element count (`vscale`) corresponds directly to the RVV *Maximum Vector Length* (MVL).

9 + 10 +

The IR occasionally uses RISCV-specific intrinsics that take and return the *active vector length* as an integer value, as in this fragment:

11 + 12 +

```

13 +

%vl = call i32 @llvm.riscv.setvl(i32 %n)

14 +

%a = call <scalable 1 x i32> @llvm.riscv.vlw(i32* %p, i32 %vl)

15 +

```

16 + 17 +

However, there are also target-independent operations such as `add <scalable 1 x i32>` that have no such concept.
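To make the role of the active vector length concrete, here is a minimal standalone C++ sketch of a strip-mined loop built around a `setvl`-like function. It is not code from this patch: `kAssumedMVL`, `setvl` and `stripMinedAdd` are hypothetical names, and the sketch assumes that `setvl` returns the smaller of the requested element count and MVL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the hardware's maximum vector length (MVL).
// On real hardware this value is not known at compile time.
constexpr std::uint32_t kAssumedMVL = 4;

// Models the intrinsic @llvm.riscv.setvl for illustration only.
// Assumption: the active vector length is min(n, MVL).
std::uint32_t setvl(std::uint32_t n) { return std::min(n, kAssumedMVL); }

// Strip-mined elementwise add: each iteration asks for the remaining
// element count and processes only the lanes that setvl grants.
std::vector<std::int32_t> stripMinedAdd(const std::vector<std::int32_t> &a,
                                        const std::vector<std::int32_t> &b) {
  assert(a.size() == b.size());
  std::vector<std::int32_t> out(a.size());
  std::size_t i = 0;
  while (i < a.size()) {
    std::uint32_t vl = setvl(static_cast<std::uint32_t>(a.size() - i));
    for (std::uint32_t lane = 0; lane < vl; ++lane)
      out[i + lane] = a[i + lane] + b[i + lane];
    i += vl; // advance by the number of lanes actually processed
  }
  return out;
}
```

With five elements and an assumed MVL of 4, the loop runs twice, first with a vector length of 4 and then with a vector length of 1.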

18 + 19 +

## Instruction Selection

20 + 21 +

Instruction selection ignores the vector unit configuration entirely: it operates as if we had a full hard-wired set of regular vector registers with some unknown-but-fixed number of elements. This is done by using pseudo-instructions with suffix `_ImpConf` (*imp*licit *conf*iguration).

22 + 23 +

In general these instructions would omit an important dependency and therefore open us up to miscompilations, but during instruction selection it's fine because we select no instructions that *change* the configuration (that comes later).

24 + 25 +

Another concern is the active vector length, which is an operand to most vector instructions we want to select. For IR instructions such as `add <scalable 1 x i32>` that have no VL operand in IR, we use MVL for that operation (and hope to optimize it later). For RISCV-specific intrinsics that handle the active vector length, we just use the integer passed to that intrinsic.
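The rule for picking the vector length operand can be sketched in a few lines of standalone C++. This is an illustration of the rule as described above, not the backend's actual selection code; `VectorOp` and `chooseVL` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of a vector operation as seen by instruction selection.
struct VectorOp {
  bool hasVLOperand;     // true for VL-aware intrinsics, false for plain IR ops
  std::uint32_t vlValue; // the integer passed to the intrinsic, if any
};

// Returns the vector length the selected instruction would run with:
// the intrinsic's own VL argument if it has one, otherwise MVL.
std::uint32_t chooseVL(const VectorOp &op, std::uint32_t mvl) {
  assert(mvl > 0);
  if (op.hasVLOperand)
    return op.vlValue;
  return mvl; // target-independent ops such as a plain IR `add`
}
```

For example, with an MVL of 8, a plain IR `add` would be selected with a vector length of 8, while an intrinsic called with a VL of 3 keeps the value 3.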

26 + 27 +

## Vector length register

28 + 29 +

We represent the VL CSR as an ordinary physical register that can hold an XLEN-sized integer (`XLenVT`). There is a register class, `VLR`, that contains only VL. Most vector instructions have `VLR` inputs; `setvl` has a `VLR` output (along with the GPR result).

30 + 31 +

This register class is naturally not the preferred one for selecting integer code, but copies between it and GPRs are possible via CSR reads and writes.

32 + 33 +

> Note: Still need to decide on a solution for VLR virtregs with overlapping live ranges. As-is they cause crashes during register allocation.

34 + 35 +

## Vector unit configuration CSR

36 + 37 +

We represent the vector unit configuration CSR as an unallocatable, permanently reserved physical register.

38 +

It is implicitly used by most vector instructions, and instructions that change the configuration define this register implicitly.

39 +

There are no corresponding virtual registers; it's a physical register operand from the very start.
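The effect of these implicit operands on instruction ordering can be illustrated with a small standalone C++ model. It is only a sketch of the usual register dependency rule, not the backend's data structures: `MachineOp`, `orderedBy` and the register name `VCFG` used in the example are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified machine instruction with implicit operands.
struct MachineOp {
  std::string name;
  std::vector<std::string> uses; // registers read (including implicit uses)
  std::vector<std::string> defs; // registers written (including implicit defs)
};

static bool contains(const std::vector<std::string> &regs,
                     const std::string &reg) {
  return std::find(regs.begin(), regs.end(), reg) != regs.end();
}

// True if the two instructions must keep their relative order because of
// `reg`: both touch it and at least one of them writes it.
bool orderedBy(const MachineOp &first, const MachineOp &second,
               const std::string &reg) {
  bool firstTouches = contains(first.uses, reg) || contains(first.defs, reg);
  bool secondTouches = contains(second.uses, reg) || contains(second.defs, reg);
  bool anyDef = contains(first.defs, reg) || contains(second.defs, reg);
  return firstTouches && secondTouches && anyDef;
}
```

In this model, two vector instructions that only read `VCFG` can be reordered freely, while a configuration change that defines `VCFG` stays ordered relative to both.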

40 + 41 +

## Selecting the active vector length

42 + 43 +

> Note: This part is not implemented yet.

44 + 45 +

RISC-V vector instructions are generally controlled by the VL register: they process only the lanes up to VL and zero the remaining elements of the destination. This does not match LLVM IR semantics, where many vector operations operate on the full vector and predication is often applied as a separate operation afterwards.

46 + 47 +

For correctness, the RISCV backend must therefore be prepared to generate code that sets VL to MAXVL, i.e., operates on full vectors. This is particularly relevant for register copies and spill/fill code, but may also sometimes be required for arithmetic operations (though it shouldn't be required for IR produced by a V-aware vectorizer).
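The mismatch between the two semantics can be seen in a small standalone C++ simulation. This is a sketch under the behavior described above (lanes at or beyond VL are zeroed); `Vec`, `vaddWithVL` and `irAdd` are hypothetical names, and a fixed-size `std::vector` stands in for a vector register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<std::int32_t>;

// Models a VL-controlled add: lanes below vl are computed, the remaining
// lanes of the destination are zeroed.
Vec vaddWithVL(const Vec &a, const Vec &b, std::size_t vl) {
  assert(a.size() == b.size() && vl <= a.size());
  Vec out(a.size(), 0);
  for (std::size_t i = 0; i < vl; ++i)
    out[i] = a[i] + b[i];
  return out;
}

// Models a target-independent IR `add`, which operates on every lane.
// In this model that is the same as running with VL equal to the full length.
Vec irAdd(const Vec &a, const Vec &b) { return vaddWithVL(a, b, a.size()); }
```

With inputs `{1, 2, 3, 4}` and `{10, 20, 30, 40}`, the VL-controlled add at a vector length of 2 produces `{11, 22, 0, 0}`, whereas the IR-style add produces `{11, 22, 33, 44}`. Running the instruction with VL set to the full length is what makes the two agree.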

48 + 49 +

Nevertheless, we really want to use VL in the natural way. Thus we perform a "demanded elements" analysis which symbolically computes .

50 + 51 +

This is only correct for side-effect-free operations, but operations with side effects need to be "inherently predicated" in the IR already (e.g., consider `@llvm.masked.load` versus a regular vector load).

52 + 53 +

## Deciding the configuration

54 + 55 +

The configuration is decided before register allocation, because register allocation needs complete knowledge of the physical register file.

56 +

Integrating these decisions into register allocation would be even better, but it's not clear how to do that (and it would likely be a huge project in any case).

57 +

By looking at the live intervals, proportions of data type widths among them, etc. we can at least try to heuristically find a good fit.

58 + 59 +

In the future we should try to separate the function body into several regions between which we can safely change the configuration (at minimum this means no vector values live between them, not even in memory).

60 + 61 +

## Scalar operations in vector registers

62 + 63 +

Details TBD; see https://lists.llvm.org/pipermail/llvm-dev/2018-October/126733.html for some thoughts on this.

64 + 65 +

## Emitting configuration instructions

66 + 67 +

Currently the vector unit is configured in the prologue and disabled in the epilogue.

68 + 69 +

In the future, especially once we start to support more than one configuration per function, we may insert configuration changes earlier, and also place them closer to the code that actually uses the vector unit.

Original file line number Diff line number Diff line change

@@ -2032,6 +2032,8 @@ class ShuffleVectorInst : public Instruction {

2032 2032

static void getShuffleMask(const Constant *Mask,

2033 2033

SmallVectorImpl<int> &Result);

2034 2034 2035 +

static bool getShuffleMask(Value *Mask, SmallVectorImpl<int> &Result);

2036 + 2035 2037

/// Return the mask for this instruction as a vector of integers. Undefined

2036 2038

/// elements of the mask are returned as -1.

2037 2039

void getShuffleMask(SmallVectorImpl<int> &Result) const {

Original file line number Diff line number Diff line change

@@ -285,6 +285,8 @@ def llvm_v2f64_ty : LLVMType<v2f64>; // 2 x double

285 285

def llvm_v4f64_ty : LLVMType<v4f64>; // 4 x double

286 286

def llvm_v8f64_ty : LLVMType<v8f64>; // 8 x double

287 287 288 +

def llvm_nxv1i32_ty : LLVMType<nxv1i32>; // scalable 1 x i32

289 + 288 290

def llvm_vararg_ty : LLVMType<isVoid>; // this means vararg here

289 291 290 292

//===----------------------------------------------------------------------===//

@@ -1230,6 +1232,8 @@ let IntrProperties = [IntrNoMem, IntrWillReturn] in {

1230 1232

[llvm_anyvector_ty]>;

1231 1233

def int_experimental_vector_reduce_fmin : Intrinsic<[LLVMVectorElementType<0>],

1232 1234

[llvm_anyvector_ty]>;

1235 +

def int_experimental_vector_splatvector : Intrinsic<[LLVMVectorElementType<0>],

1236 +

[llvm_anyvector_ty]>;

1233 1237

}

1234 1238 1235 1239

//===---------- Intrinsics to control hardware supported loops ----------===//

Original file line number Diff line number Diff line change

@@ -65,4 +65,51 @@ def int_riscv_masked_cmpxchg_i64

65 65

llvm_i64_ty, llvm_i64_ty],

66 66

[IntrArgMemOnly, NoCapture<0>, ImmArg<4>]>;

67 67 68 +

//===----------------------------------------------------------------------===//

69 +

// Vector extension

70 + 71 +

def int_riscv_setvl : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem]>;

72 + 73 +

def int_riscv_vadd : Intrinsic<[llvm_nxv1i32_ty],

74 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

75 +

[IntrNoMem]>;

76 + 77 +

def int_riscv_vsub : Intrinsic<[llvm_nxv1i32_ty],

78 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

79 +

[IntrNoMem]>;

80 + 81 + 82 +

def int_riscv_vmul : Intrinsic<[llvm_nxv1i32_ty],

83 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

84 +

[IntrNoMem]>;

85 + 86 + 87 +

def int_riscv_vand : Intrinsic<[llvm_nxv1i32_ty],

88 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

89 +

[IntrNoMem]>;

90 + 91 + 92 +

def int_riscv_vor : Intrinsic<[llvm_nxv1i32_ty],

93 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

94 +

[IntrNoMem]>;

95 + 96 +

def int_riscv_vxor : Intrinsic<[llvm_nxv1i32_ty],

97 +

[llvm_nxv1i32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

98 +

[IntrNoMem]>;

99 + 100 +

def int_riscv_vlw : Intrinsic<[llvm_nxv1i32_ty],

101 +

[llvm_ptr32_ty, llvm_i32_ty],

102 +

[IntrReadMem]>;

103 +

def int_riscv_vsw : Intrinsic<[],

104 +

[llvm_ptr32_ty, llvm_nxv1i32_ty, llvm_i32_ty],

105 +

[IntrWriteMem]>;

106 + 107 +

def int_riscv_vmpopcnt : Intrinsic<[llvm_i32_ty],

108 +

[llvm_nxv1i32_ty, llvm_i32_ty],

109 +

[IntrNoMem]>;

110 + 111 +

def int_riscv_vmfirst : Intrinsic<[llvm_i32_ty],

112 +

[llvm_nxv1i32_ty, llvm_i32_ty],

113 +

[IntrNoMem]>;

114 + 68 115

} // TargetPrefix = "riscv"

Original file line number Diff line number Diff line change

@@ -1545,6 +1545,17 @@ SDValue SelectionDAGBuilder::getValueImpl(const Value *V) {

1545 1545

Op = DAG.getConstantFP(0, getCurSDLoc(), EltVT);

1546 1546

else

1547 1547

Op = DAG.getConstant(0, getCurSDLoc(), EltVT);

1548 + 1549 +

if (VT.isScalableVector()) {

1550 +

auto INum = DAG.getConstant(Intrinsic::experimental_vector_splatvector,

1551 +

getCurSDLoc(), MVT::i32);

1552 + 1553 +

auto Splat = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, getCurSDLoc(), VT,

1554 +

INum, Op);

1555 + 1556 +

return Splat;

1557 +

}

1558 + 1548 1559

Ops.assign(NumElements, Op);

1549 1560

}

1550 1561

@@ -3538,17 +3549,51 @@ void SelectionDAGBuilder::visitExtractElement(const User &I) {

3538 3549

void SelectionDAGBuilder::visitShuffleVector(const User &I) {

3539 3550

SDValue Src1 = getValue(I.getOperand(0));

3540 3551

SDValue Src2 = getValue(I.getOperand(1));

3552 +

Value *MaskV = I.getOperand(2);

3541 3553

SDLoc DL = getCurSDLoc();

3542 3554 3543 -

SmallVector<int, 8> Mask;

3544 -

ShuffleVectorInst::getShuffleMask(cast<Constant>(I.getOperand(2)), Mask);

3545 -

unsigned MaskNumElts = Mask.size();

3546 - 3547 3555

const TargetLowering &TLI = DAG.getTargetLoweringInfo();

3548 3556

EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());

3557 +

bool IsScalable = VT.isScalableVector();

3549 3558

EVT SrcVT = Src1.getValueType();

3550 3559

unsigned SrcNumElts = SrcVT.getVectorNumElements();

3551 3560 3561 +

SmallVector<int, 8> Mask;

3562 +

if (!ShuffleVectorInst::getShuffleMask(MaskV, Mask)) {

3563 +

SDValue Mask = getValue(I.getOperand(2));

3564 +

unsigned NumElts = VT.getVectorNumElements();

3565 +

// We don't currently support variable shuffles on fixed-length vectors

3566 +

assert(IsScalable && "Non-constant shuffle mask on fixed-length vector");

3567 + 3568 +

// We haven't introduced a vector_shuffle_var intrinsic to support shuffles

3569 +

// where we need to extract or merge vectors.

3570 +

if (NumElts != SrcNumElts)

3571 +

llvm_unreachable("Haven't implemented VECTOR_SHUFFLE_VAR intrinsic yet");

3572 + 3573 +

// Currently only handling splats of a single value for scalable vectors

3574 +

if (auto *CMask = dyn_cast<Constant>(MaskV))

3575 +

if (CMask->isNullValue()) {

3576 +

// Splat of first element.

3577 +

auto FirstElt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL,

3578 +

SrcVT.getScalarType(), Src1,

3579 +

DAG.getConstant(0, DL,

3580 +

TLI.getVectorIdxTy(DAG.getDataLayout())));

3581 + 3582 +

auto INum = DAG.getConstant(Intrinsic::experimental_vector_splatvector,

3583 +

DL, MVT::i32);

3584 + 3585 +

auto Splat = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DL, VT,

3586 +

INum, FirstElt);

3587 + 3588 +

setValue(&I, Splat);

3589 +

return;

3590 +

}

3591 +

llvm_unreachable("Haven't implemented VECTOR_SHUFFLE_VAR intrinsic yet");

3592 +

return;

3593 +

}

3594 + 3595 +

unsigned MaskNumElts = Mask.size();

3596 + 3552 3597

if (SrcNumElts == MaskNumElts) {

3553 3598

setValue(&I, DAG.getVectorShuffle(VT, DL, Src1, Src2, Mask));

3554 3599

return;

Original file line number Diff line number Diff line change

@@ -797,8 +797,9 @@ Constant *llvm::ConstantFoldExtractElementInstruction(Constant *Val,

797 797 798 798

if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Idx)) {

799 799

// ee({w,x,y,z}, wrong_value) -> undef

800 -

if (CIdx->uge(Val->getType()->getVectorNumElements()))

801 -

return UndefValue::get(Val->getType()->getVectorElementType());

800 +

if (!Val->getType()->getVectorIsScalable())

801 +

if (CIdx->uge(Val->getType()->getVectorNumElements()))

802 +

return UndefValue::get(Val->getType()->getVectorElementType());

802 803

return Val->getAggregateElement(CIdx->getZExtValue());

803 804

}

804 805

return nullptr;

@@ -810,6 +811,10 @@ Constant *llvm::ConstantFoldInsertElementInstruction(Constant *Val,

810 811

if (isa<UndefValue>(Idx))

811 812

return UndefValue::get(Val->getType());

812 813 814 +

// Everything after this point assumes you can iterate across Val.

815 +

if (Val->getType()->getVectorIsScalable())

816 +

return nullptr;

817 + 813 818

ConstantInt *CIdx = dyn_cast<ConstantInt>(Idx);

814 819

if (!CIdx) return nullptr;

815 820

@@ -837,8 +842,9 @@ Constant *llvm::ConstantFoldInsertElementInstruction(Constant *Val,

837 842

Constant *llvm::ConstantFoldShuffleVectorInstruction(Constant *V1,

838 843

Constant *V2,

839 844

Constant *Mask) {

840 -

unsigned MaskNumElts = Mask->getType()->getVectorNumElements();

841 -

Type *EltTy = V1->getType()->getVectorElementType();

845 +

auto *MaskTy = cast<VectorType>(Mask->getType());

846 +

auto MaskNumElts = MaskTy->getElementCount();

847 +

Type *EltTy = V1->getType()->getVectorElementType();

842 848 843 849

// Undefined shuffle mask -> undefined value.

844 850

if (isa<UndefValue>(Mask))

@@ -847,11 +853,23 @@ Constant *llvm::ConstantFoldShuffleVectorInstruction(Constant *V1,

847 853

// Don't break the bitcode reader hack.

848 854

if (isa<ConstantExpr>(Mask)) return nullptr;

849 855 856 +

if (MaskTy->isScalable()) {

857 +

// Is splat?

858 +

if (Mask->isNullValue()) {

859 +

Constant *Zero = Constant::getNullValue(MaskTy->getElementType());

860 +

Constant *SplatVal = ConstantFoldExtractElementInstruction(V1, Zero);

861 +

// Is splat of zero?

862 +

if (SplatVal && SplatVal->isNullValue())

863 +

return Constant::getNullValue(VectorType::get(EltTy, MaskNumElts));

864 +

}

865 +

return nullptr;

866 +

}

867 + 850 868

unsigned SrcNumElts = V1->getType()->getVectorNumElements();

851 869 852 870

// Loop over the shuffle mask, evaluating each element.

853 871

SmallVector<Constant*, 32> Result;

854 -

for (unsigned i = 0; i != MaskNumElts; ++i) {

872 +

for (unsigned i = 0; i != MaskNumElts.Min; ++i) {

855 873

int Elt = ShuffleVectorInst::getMaskValue(Mask, i);

856 874

if (Elt == -1) {

857 875

Result.push_back(UndefValue::get(EltTy));

Original file line number Diff line number Diff line change

@@ -2166,8 +2166,9 @@ Constant *ConstantExpr::getShuffleVector(Constant *V1, Constant *V2,

2166 2166

return FC; // Fold a few common cases.

2167 2167 2168 2168

unsigned NElts = Mask->getType()->getVectorNumElements();

2169 +

bool Scalable = Mask->getType()->getVectorIsScalable();

2169 2170

Type *EltTy = V1->getType()->getVectorElementType();

2170 -

Type *ShufTy = VectorType::get(EltTy, NElts);

2171 +

Type *ShufTy = VectorType::get(EltTy, NElts, Scalable);

2171 2172 2172 2173

if (OnlyIfReducedTy == ShufTy)

2173 2174

return nullptr;

Original file line number Diff line number Diff line change

@@ -149,7 +149,7 @@ class ShuffleVectorConstantExpr : public ConstantExpr {

149 149

ShuffleVectorConstantExpr(Constant *C1, Constant *C2, Constant *C3)

150 150

: ConstantExpr(VectorType::get(

151 151

cast<VectorType>(C1->getType())->getElementType(),

152 -

cast<VectorType>(C3->getType())->getNumElements()),

152 +

cast<VectorType>(C3->getType())->getElementCount()),

153 153

Instruction::ShuffleVector,

154 154

&Op<0>(), 3) {

155 155

Op<0>() = C1;
