The Accumulate operator accumulates the input tensor into the output tensor. If the output tensor already has the right size, we add to it; otherwise, we first initialize the output tensor to all zeros and then accumulate. Any further calls to the operator, provided nothing else modifies the output in the interim, will do simple accumulations. Accumulation is done with an Axpby operation:
Y = 1*X + gamma*Y
where X is the input tensor, Y is the output tensor and gamma is the multiplier argument.
CPU caffe2::AccumulateOp<float, caffe2::CPUContext>
GPU caffe2::AccumulateOp<float, caffe2::CUDAContext>
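The Axpby rule above can be sketched in NumPy (a sketch of the semantics only, not the Caffe2 API; the function name is illustrative):

```python
import numpy as np

def accumulate(X, Y=None, gamma=1.0):
    # Y = 1*X + gamma*Y; if Y is absent or mis-sized, it is first
    # initialized to zeros, so the first call effectively copies X.
    if Y is None or Y.shape != X.shape:
        Y = np.zeros_like(X)
    return X + gamma * Y

Y = accumulate(np.array([1.0, 2.0]))                 # first call: Y = X
Y = accumulate(np.array([3.0, 4.0]), Y, gamma=0.5)   # Y = X + 0.5*Y
```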
Accuracy takes two inputs, predictions and labels, and returns a float accuracy value for the batch. Predictions are expected as a 2-D tensor containing a batch of scores for various classes, and labels as a 1-D tensor containing the true label indices of the samples in the batch. If the score for the label index in the predictions is the highest among all classes, it is counted as a correct prediction.
CPU caffe2::AccuracyOp<float, caffe2::CPUContext>
GPU caffe2::AccuracyOp<float, caffe2::CUDAContext>
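A NumPy sketch of this computation (illustrative only, not the Caffe2 API):

```python
import numpy as np

def accuracy(predictions, labels):
    # predictions: (batch, num_classes) scores; labels: (batch,) true indices.
    # A sample is correct when its label's score is the per-row argmax.
    return float(np.mean(np.argmax(predictions, axis=1) == labels))

scores = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([1, 0, 0])
acc = accuracy(scores, labels)   # 2 of 3 predictions are correct
```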
Performs element-wise binary addition (with limited broadcast support). If necessary, the right-hand-side argument will be broadcast to match the shape of the left-hand-side argument. When broadcasting is enabled, the second tensor can either be of size 1 (a scalar value) or have a shape that is a contiguous subset of the first tensor's shape. The start of the mutually equal shape is specified by the argument "axis"; if it is not set, suffix matching is assumed. 1-dim expansion doesn't work yet. For example, the following tensor shapes are supported (with broadcast=1):
shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
shape(A) = (2, 3, 4, 5), shape(B) = (5,)
shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
The argument broadcast=1 needs to be passed to enable broadcasting.
CPU caffe2::BinaryElementwiseOp<caffe2::TensorTypes<int, long, float, double>, caffe2::CPUContext, caffe2::EigenAddFunctor, caffe2::SameTypeAsInput>
GPU caffe2::BinaryElementwiseOp<caffe2::TensorTypes<int, long, float, double>, caffe2::CUDAContext, caffe2::CudaAddFunctor, caffe2::SameTypeAsInput>
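The axis/suffix broadcast rule can be approximated in NumPy, assuming B's shape is a contiguous subset of A's as described above (a sketch, not the Caffe2 implementation; the function name is illustrative):

```python
import numpy as np

def add_broadcast(A, B, axis=None):
    # Caffe2-style broadcast: B's shape is a contiguous subset of A's,
    # starting at `axis` (suffix matching when axis is not given).
    if axis is None:
        axis = A.ndim - B.ndim
    # Append trailing singleton dims so numpy aligns B at `axis`.
    new_shape = B.shape + (1,) * (A.ndim - axis - B.ndim)
    return A + B.reshape(new_shape)

A = np.ones((2, 3, 4, 5))
out1 = add_broadcast(A, np.ones((4, 5)))          # suffix match
out2 = add_broadcast(A, np.ones((3, 4)), axis=1)  # explicit axis
```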
Given a partitioned tensor T<N, D1..., Dn>, where the partitions are defined as ranges on its outer-most (slowest varying) dimension N, with given range lengths, return a tensor T<N + 2*pad_width, D1 ..., Dn> with paddings added to the start and end of each range. Optionally, different paddings can be provided for beginning and end. Paddings provided must be a tensor T<D1..., Dn>. If no padding is provided, add zero padding. If no lengths vector is provided, add padding only once, at the start and end of data.
caffe2::(anonymous namespace)::AddPaddingOp
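A NumPy sketch of the padding behavior (the function name, return convention, and defaults are illustrative assumptions, not the Caffe2 API):

```python
import numpy as np

def add_padding(data, lengths=None, pad_width=1, padding=None):
    # Split `data` along the outer dimension into ranges of the given
    # lengths, then pad each range at its start and end. With no lengths,
    # the whole tensor is one range; with no padding, zeros are used.
    if lengths is None:
        lengths = [len(data)]
    if padding is None:
        padding = np.zeros_like(data[0])
    pad = np.stack([padding] * pad_width)
    pieces, start = [], 0
    for n in lengths:
        pieces.extend([pad, data[start:start + n], pad])
        start += n
    new_lengths = [n + 2 * pad_width for n in lengths]
    return np.concatenate(pieces), new_lengths

padded, new_lengths = add_padding(np.ones((4, 2)), lengths=[1, 3])
```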
Makes the output and the input share the same underlying storage. WARNING: in general, in caffe2's operator interface different tensors should have different underlying storage, which is the assumption made by components such as the dependency engine and memory optimization. Thus, in normal situations you should not use the AliasOp, especially in a normal forward-backward pass. The Alias op is provided so one can achieve true asynchrony, such as Hogwild, in a graph. But make sure you understand all the implications, as with multi-threaded computation, before you use it explicitly.
CPU caffe2::AliasOp<caffe2::CPUContext>
GPU caffe2::AliasOp<caffe2::CUDAContext>
Does an allgather operation among the nodes.
CPU caffe2::NoDefaultEngineOp<caffe2::CPUContext>
GPU caffe2::NoDefaultEngineOp<caffe2::CUDAContext>
Does an allreduce operation among the nodes. Currently only Sum is supported.
CPU caffe2::NoDefaultEngineOp<caffe2::CPUContext>
GPU caffe2::NoDefaultEngineOp<caffe2::CUDAContext>
Performs the element-wise logical operation and (with limited broadcast support). Both input operands should be of type bool. If necessary, the right-hand-side argument will be broadcast to match the shape of the left-hand-side argument. When broadcasting is enabled, the second tensor can either be of size 1 (a scalar value) or have a shape that is a contiguous subset of the first tensor's shape. The start of the mutually equal shape is specified by the argument "axis"; if it is not set, suffix matching is assumed. 1-dim expansion doesn't work yet. For example, the following tensor shapes are supported (with broadcast=1):
shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
shape(A) = (2, 3, 4, 5), shape(B) = (5,)
shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0
The argument broadcast=1 needs to be passed to enable broadcasting.
CPU caffe2::BinaryElementwiseOp<caffe2::TensorTypes<bool>, caffe2::CPUContext, caffe2::NaiveAndFunctor, caffe2::FixedType<bool> >
GPU caffe2::BinaryElementwiseOp<caffe2::TensorTypes<bool>, caffe2::CUDAContext, caffe2::CudaAndFunctor, caffe2::FixedType<bool> >
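A short NumPy illustration of the suffix-matching case (a sketch of the semantics, not the Caffe2 API):

```python
import numpy as np

A = np.array([[True, False], [True, True]])
B = np.array([True, False])            # shape (2,) suffix-matches A's shape
out = np.logical_and(A, B)             # B is broadcast across A's rows
```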
Append input 2 to the end of input 1. Input 1 must be the same as the output, that is, the operation is required to be in-place. Input 1 may have to be re-allocated to accommodate the new size. Currently, an exponential growth ratio is used to ensure amortized constant time complexity. All dimensions except the outer-most must be the same between inputs 1 and 2.
caffe2::(anonymous namespace)::AppendOp<caffe2::CPUContext>
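Functionally this is a concatenation along the outer dimension, which can be sketched in NumPy (the in-place growth and exponential re-allocation are not modeled here; the function name is illustrative):

```python
import numpy as np

def append(dst, src):
    # All dimensions except the outer-most must match; the real op grows
    # the destination buffer in place with exponential re-allocation.
    assert dst.shape[1:] == src.shape[1:], "inner dimensions must match"
    return np.concatenate([dst, src], axis=0)

out = append(np.zeros((2, 3)), np.ones((4, 3)))
```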
No documentation yet.
caffe2/operators/dataset_ops.cc
caffe2::(anonymous namespace)::AtomicAppendOp<caffe2::CPUContext>
Given a mutex and two int32 scalar tensors, performs an atomic fetch-add: under the mutex, the second input is added to the first input in place. Returns the updated integer and the value prior to the update.
caffe2::fb::(anonymous namespace)::AtomicFetchAddOp
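The fetch-add semantics can be sketched with a stdlib lock (the class and method names are illustrative, not the Caffe2 interface):

```python
import threading

class AtomicCounter:
    # A mutex guards the counter; fetch_add returns both the updated
    # value and the value prior to the update, like the operator's outputs.
    def __init__(self, value=0):
        self._mutex = threading.Lock()
        self._value = value

    def fetch_add(self, delta):
        with self._mutex:
            prev = self._value
            self._value += delta
            return self._value, prev

counter = AtomicCounter(10)
updated, prev = counter.fetch_add(5)
```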
AveragePool consumes an input blob X and applies average pooling across the blob according to kernel sizes, stride sizes, and pad lengths defined by the ConvPoolOpBase operator. Average pooling consists of averaging all values of a subset of the input tensor according to the kernel size and downsampling the data into the output blob Y for further processing.
CPU caffe2::PoolOp<float, caffe2::CPUContext, caffe2::(anonymous namespace)::AveragePool>
GPU caffe2::PoolOp<float, caffe2::CUDAContext, caffe2::(anonymous namespace)::AveragePool>
CUDNN
on CUDA
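A minimal NumPy sketch of NCHW average pooling without padding (illustrative only; the real op supports padding and per-dimension kernels/strides):

```python
import numpy as np

def average_pool(X, kernel=2, stride=2):
    # NCHW average pooling, no padding: each output element is the mean
    # of a kernel x kernel window of the input.
    N, C, H, W = X.shape
    Ho = (H - kernel) // stride + 1
    Wo = (W - kernel) // stride + 1
    Y = np.zeros((N, C, Ho, Wo), dtype=X.dtype)
    for i in range(Ho):
        for j in range(Wo):
            h0, w0 = i * stride, j * stride
            Y[:, :, i, j] = X[:, :, h0:h0 + kernel, w0:w0 + kernel].mean(axis=(2, 3))
    return Y

X = np.arange(16, dtype=float).reshape(1, 1, 4, 4)
Y = average_pool(X)
```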
No documentation yet.
CPU caffe2::PoolGradientOp<float, caffe2::CPUContext, caffe2::(anonymous namespace)::AveragePool>
GPU caffe2::PoolGradientOp<float, caffe2::CUDAContext, caffe2::(anonymous namespace)::AveragePool>
CUDNN
on CUDA
AveragedLoss takes a 1-D tensor as input and returns a single float output value representing the average of the input data (the average of the losses).
CPU caffe2::AveragedLoss<float, caffe2::CPUContext>
GPU caffe2::AveragedLoss<float, caffe2::CUDAContext>
No documentation yet.
CPU caffe2::AveragedLossGradient<float, caffe2::CPUContext>
GPU caffe2::AveragedLossGradientGPUSpecialization
Batch matrix multiplication Yi = Ai * Bi, where A has size (C x M x K) and B has size (C x K x N), C is the batch size, and i ranges from 0 to C-1.
CPU caffe2::BatchMatMulOp<float, caffe2::CPUContext, caffe2::DefaultEngine>
GPU caffe2::BatchMatMulOp<float, caffe2::CUDAContext, caffe2::DefaultEngine>
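NumPy's batched matmul follows the same convention, which makes the shape contract easy to check (a sketch, not the Caffe2 API):

```python
import numpy as np

C, M, K, N = 2, 3, 4, 5
A = np.random.rand(C, M, K)
B = np.random.rand(C, K, N)
Y = np.matmul(A, B)   # Y[i] = A[i] @ B[i] for each batch index i
```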
BatchToSpace for 4-D tensors of type T. Rearranges (permutes) data from batch into blocks of spatial data, followed by cropping. This is the reverse transformation of SpaceToBatch. More specifically, this op outputs a copy of the input tensor where values from the batch dimension are moved in spatial blocks to the height and width dimensions, followed by cropping along the height and width dimensions.
caffe2/operators/space_batch_op.cc
CPU caffe2::BatchToSpaceOp<caffe2::CPUContext>
GPU caffe2::BatchToSpaceOp<caffe2::CUDAContext>
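A NumPy sketch of the batch-to-space rearrangement for NCHW tensors; note the block ordering within the batch dimension assumed here is one common convention, and Caffe2's exact ordering may differ:

```python
import numpy as np

def batch_to_space(X, block_size=2, crop=0):
    # NCHW sketch: factor the batch into block_size*block_size spatial
    # blocks, interleave them into height/width, then crop the borders.
    # (This block ordering is an assumption, not Caffe2's documented one.)
    Nb, C, H, W = X.shape
    N = Nb // (block_size ** 2)
    Y = X.reshape(block_size, block_size, N, C, H, W)
    Y = Y.transpose(2, 3, 4, 0, 5, 1)           # -> (N, C, H, b, W, b)
    Y = Y.reshape(N, C, H * block_size, W * block_size)
    if crop:
        Y = Y[:, :, crop:-crop, crop:-crop]
    return Y

Y = batch_to_space(np.arange(4.0).reshape(4, 1, 1, 1))
```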
Given a 1-D data tensor and a boolean mask tensor of the same shape, returns a tensor containing only the elements corresponding to positions where the mask is true.
caffe2::(anonymous namespace)::BooleanMaskOp<caffe2::CPUContext>
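This is equivalent to boolean indexing in NumPy:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
mask = np.array([True, False, True, False, True])
out = data[mask]   # keep only elements at positions where the mask is true
```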
Given a tensor of int32 segment lengths and a boolean mask tensor, returns the segment lengths of the corresponding segmented tensor after BooleanMask is applied.
caffe2::(anonymous namespace)::BooleanMaskLengthsOp<caffe2::CPUContext>
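A NumPy sketch of the length recomputation (the function name is illustrative):

```python
import numpy as np

def boolean_mask_lengths(lengths, mask):
    # Each segment's new length is the number of true mask entries
    # falling inside that segment's span.
    out, start = [], 0
    for n in lengths:
        out.append(int(mask[start:start + n].sum()))
        start += n
    return out

new_lengths = boolean_mask_lengths([2, 3], np.array([True, False, True, True, False]))
```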
Does a broadcast operation from the root node to every other node. The tensor on each node should have been pre-created with the same shape and data type.
CPU caffe2::NoDefaultEngineOp<caffe2::CPUContext>
GPU caffe2::NoDefaultEngineOp<caffe2::CUDAContext>