LLVM
8.0.1
This pass adds instructions to enable whole quad mode for pixel shaders, and whole wavefront mode for all programs.
#include "AMDGPU.h"
#include "AMDGPUSubtarget.h"
#include "SIInstrInfo.h"
#include "SIMachineFunctionInfo.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/CodeGen/LiveInterval.h"
#include "llvm/CodeGen/LiveIntervals.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineOperand.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/SlotIndexes.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/Pass.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
#include <cassert>
#include <vector>
Macros
#define DEBUG_TYPE "si-wqm"
Enumerations
enum (anonymous)
Functions
INITIALIZE_PASS_BEGIN (SIWholeQuadMode, DEBUG_TYPE, "SI Whole Quad Mode", false, false)
INITIALIZE_PASS_END (SIWholeQuadMode, DEBUG_TYPE, "SI Whole Quad Mode", false, false)
Whole quad mode is required for derivative computations, but it interferes with shader side effects (stores and atomics). This pass is run on the scheduled machine IR but before register coalescing, so that machine SSA is available for analysis. It ensures that WQM is enabled when necessary, but disabled around stores and atomics.
When necessary, this pass creates a function prolog
    S_MOV_B64 LiveMask, EXEC
    S_WQM_B64 EXEC, EXEC
to enter WQM at the top of the function and surrounds blocks of Exact instructions by
    S_AND_SAVEEXEC_B64 Tmp, LiveMask
    ...
    S_MOV_B64 EXEC, Tmp
We also compute when a sequence of instructions requires Whole Wavefront Mode (WWM) and insert instructions to save and restore it:
    S_OR_SAVEEXEC_B64 Tmp, -1
    ...
    S_MOV_B64 EXEC, Tmp
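The effect of these sequences on the execution mask can be modeled on the host. The following is a minimal sketch, assuming a 64-lane wavefront in which each group of four consecutive lanes forms a quad; WaveState and the helper functions are hypothetical names for illustration and are not part of the pass or of any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the scalar state touched by the sequences above.
// All names here are illustrative assumptions, not LLVM identifiers.
struct WaveState {
  uint64_t Exec;     // one bit per lane of a 64-wide wavefront
  uint64_t LiveMask; // lanes that were active on function entry
};

// Models S_WQM_B64: for every quad (four consecutive lanes), enable all
// four lanes if at least one of them is enabled.
uint64_t wholeQuadMode(uint64_t Mask) {
  uint64_t Result = 0;
  for (unsigned Quad = 0; Quad < 16; ++Quad) {
    uint64_t QuadBits = 0xFULL << (4 * Quad);
    if (Mask & QuadBits)
      Result |= QuadBits;
  }
  return Result;
}

// Prolog: S_MOV_B64 LiveMask, EXEC followed by S_WQM_B64 EXEC, EXEC.
void enterWQM(WaveState &S) {
  S.LiveMask = S.Exec;
  S.Exec = wholeQuadMode(S.Exec);
}

// Models S_AND_SAVEEXEC_B64 Tmp, LiveMask: saves EXEC, then restricts it
// to the live lanes. The saved value (Tmp) is returned.
uint64_t enterExact(WaveState &S) {
  uint64_t Tmp = S.Exec;
  S.Exec = S.LiveMask & S.Exec;
  return Tmp;
}

// Models S_OR_SAVEEXEC_B64 Tmp, -1: saves EXEC, then enables every lane.
uint64_t enterWWM(WaveState &S) {
  uint64_t Tmp = S.Exec;
  S.Exec = ~0ULL | S.Exec;
  return Tmp;
}

// Models S_MOV_B64 EXEC, Tmp: restores the saved mask.
void restoreExec(WaveState &S, uint64_t Tmp) { S.Exec = Tmp; }
```

In this model, helper lanes enabled by enterWQM are masked off again by enterExact, which is why stores and atomics placed between enterExact and restoreExec only execute for lanes that were live on entry.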
To avoid excessive switching during sequences of Exact instructions, the pass first analyzes which instructions must be run in WQM (that is, which instructions produce values that lead to derivative computations).
Basic blocks are always exited in WQM as long as some successor needs WQM.
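The analysis of which instructions must run in WQM can be pictured as a backward propagation over SSA definitions. The following is a simplified sketch under stated assumptions, not the data structures used in SIWholeQuadMode.cpp: ToyInstr, its IsDerivative flag, and computeNeedsWQM are hypothetical names, and each instruction is assumed to define a single value identified by its index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction. Operands holds the indices of the instructions that
// define its inputs (in machine SSA, each value has a single definition).
struct ToyInstr {
  std::vector<std::size_t> Operands;
  bool IsDerivative = false; // hypothetical flag: needs WQM directly
};

// Returns, for each instruction, whether it must run in WQM: either it is
// a derivative computation itself, or it produces a value that leads to one.
std::vector<bool> computeNeedsWQM(const std::vector<ToyInstr> &Instrs) {
  std::vector<bool> Needs(Instrs.size(), false);
  std::vector<std::size_t> Worklist;

  // Seed the worklist with instructions that need WQM by themselves.
  for (std::size_t I = 0; I < Instrs.size(); ++I) {
    if (Instrs[I].IsDerivative) {
      Needs[I] = true;
      Worklist.push_back(I);
    }
  }

  // Walk backwards from each marked instruction to the definitions of
  // its operands, marking each definition at most once.
  while (!Worklist.empty()) {
    std::size_t I = Worklist.back();
    Worklist.pop_back();
    for (std::size_t Def : Instrs[I].Operands) {
      assert(Def < Instrs.size() && "operand refers to unknown instruction");
      if (!Needs[Def]) {
        Needs[Def] = true;
        Worklist.push_back(Def);
      }
    }
  }
  return Needs;
}
```

Instructions left unmarked by such a propagation are the ones that can run in Exact mode or that do not care about the mode, which is what limits the number of mode switches.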
There is room for improvement given better control flow analysis:
(1) at the top level (outside of control flow statements, and as long as kill hasn't been used), one SGPR can be saved by recovering WQM from the LiveMask (this is implemented for the entry block).
(2) when entire regions (e.g. if-else blocks or entire loops) only consist of exact and don't-care instructions, the switch only has to be done at the entry and exit points rather than potentially in each block of the region.
Definition in file SIWholeQuadMode.cpp.
#define DEBUG_TYPE "si-wqm"
Definition at line 90 of file SIWholeQuadMode.cpp.
anonymous enum
Definition at line 94 of file SIWholeQuadMode.cpp.
INITIALIZE_PASS_BEGIN (SIWholeQuadMode, DEBUG_TYPE, "SI Whole Quad Mode", false, false)
INITIALIZE_PASS_END (SIWholeQuadMode, DEBUG_TYPE, "SI Whole Quad Mode", false, false)
Definition at line 216 of file SIWholeQuadMode.cpp.