Tpetra parallel linear algebra
Version of the Day
Namespace for Tpetra implementation details.
Namespaces | |
DefaultTypes | |
Declarations of values of Tpetra classes' default template parameters. | |
Classes | |
struct | AbsMax |
Functor for the ABSMAX CombineMode of Import and Export operations. | |
class | ContiguousUniformDirectory |
Implementation of Directory for a contiguous, uniformly distributed Map. | |
struct | CrsMatrixGetDiagCopyFunctor |
Functor that implements much of the one-argument overload of Tpetra::CrsMatrix::getLocalDiagCopy, for the case where the matrix is fill complete. | |
class | Directory |
Computes the local ID and process ID corresponding to given global IDs. | |
class | DistributedContiguousDirectory |
Implementation of Directory for a distributed contiguous Map. | |
class | DistributedNoncontiguousDirectory |
Implementation of Directory for a distributed noncontiguous Map. | |
class | FixedHashTable |
struct | Hash |
The hash function for FixedHashTable. | |
struct | Hash< KeyType, DeviceType, OffsetType, int > |
Specialization for ResultType = int. | |
class | HashTable |
class | InvalidGlobalIndex |
Exception thrown by CrsMatrix on invalid global index. | |
class | InvalidGlobalRowIndex |
Exception thrown by CrsMatrix on invalid global row index. | |
class | LocalMap |
"Local" part of Map suitable for Kokkos kernels. | |
class | MapCloner |
Implementation detail of Map::clone(). | |
struct | MultiVectorCloner |
Implementation of Tpetra::MultiVector::clone(). | |
class | MultiVectorFillerData |
Implementation of fill and local assembly for MultiVectorFiller. | |
class | MultiVectorFillerData2 |
Second implementation of fill and local assembly for MultiVectorFiller. | |
class | OptColMap |
Implementation detail of makeOptimizedColMap and makeOptimizedColMapAndImport. | |
struct | OrdinalTraits |
Traits class for "invalid" (flag) values of integer types that Tpetra uses as local ordinals or global ordinals. | |
struct | PackTraits |
Traits class for packing / unpacking data of type T, using Kokkos data structures that live in the given space D. | |
class | ReplicatedDirectory |
Implementation of Directory for a locally replicated Map. | |
class | TieBreak |
Interface for breaking ties in ownership. | |
class | Transfer |
Common base class of Import and Export. | |
Enumerations | |
enum | EStorageStatus |
Status of the graph's or matrix's storage, when not in a fill-complete state. | |
enum | EDistributorSendType |
The type of MPI send that Distributor should use. | |
enum | EDistributorHowInitialized |
Enum indicating how and whether a Distributor was initialized. | |
Functions | |
template<class LO , class GO , class DT , class OffsetType , class NumEntType > | |
OffsetType | convertColumnIndicesFromGlobalToLocal (const Kokkos::View< LO *, DT > &lclColInds, const Kokkos::View< const GO *, DT > &gblColInds, const Kokkos::View< const OffsetType *, DT > &ptr, const LocalMap< LO, GO, DT > &lclColMap, const Kokkos::View< const NumEntType *, DT > &numRowEnt) |
Convert a (StaticProfile) CrsGraph's global column indices into local column indices. | |
template<class OffsetsViewType , class CountsViewType , class SizeType = typename OffsetsViewType::size_type> | |
OffsetsViewType::non_const_value_type | computeOffsetsFromCounts (const OffsetsViewType &ptr, const CountsViewType &counts) |
Compute offsets from counts. | |
template<class OffsetsViewType , class CountType , class SizeType = typename OffsetsViewType::size_type> | |
OffsetsViewType::non_const_value_type | computeOffsetsFromConstantCount (const OffsetsViewType &ptr, const CountType &count) |
Compute offsets from a constant count. | |
template<class OutputViewType , class InputViewType > | |
void | copyOffsets (const OutputViewType &dst, const InputViewType &src) |
Copy row offsets (in a sparse graph or matrix) from src to dst. The offsets may have different types. | |
void | gathervPrint (std::ostream &out, const std::string &s, const Teuchos::Comm< int > &comm) |
On Process 0 in the given communicator, print strings from each process in that communicator, in rank order. | |
template<class DiagType , class LocalMapType , class CrsMatrixType > | |
static LocalMapType::local_ordinal_type | getDiagCopyWithoutOffsets (const DiagType &D, const LocalMapType &rowMap, const LocalMapType &colMap, const CrsMatrixType &A) |
Given a locally indexed, local sparse matrix, and corresponding local row and column Maps, extract the matrix's diagonal entries into a 1-D Kokkos::View. | |
template<class SC , class LO , class GO , class NT > | |
LO | getLocalDiagCopyWithoutOffsetsNotFillComplete (::Tpetra::Vector< SC, LO, GO, NT > &diag, const ::Tpetra::RowMatrix< SC, LO, GO, NT > &A, const bool debug=false) |
Given a locally indexed, global sparse matrix, extract the matrix's diagonal entries into a Tpetra::Vector. | |
template<class MapType > | |
MapType | makeOptimizedColMap (std::ostream &errStream, bool &lclErr, const MapType &domMap, const MapType &colMap) |
Return an optimized reordering of the given column Map. | |
template<class MapType > | |
std::pair< MapType, Teuchos::RCP< typename OptColMap< MapType >::import_type > > | makeOptimizedColMapAndImport (std::ostream &errStream, bool &lclErr, const MapType &domMap, const MapType &colMap, const typename OptColMap< MapType >::import_type *oldImport, const bool makeImport) |
Return an optimized reordering of the given column Map. Optionally, recompute an Import from the input domain Map to the new column Map. | |
template<class OrdinalType , class IndexType > | |
IndexType | countMergeUnsortedIndices (const OrdinalType curInds[], const IndexType numCurInds, const OrdinalType inputInds[], const IndexType numInputInds) |
Count the number of column indices that can be merged into the current row, assuming that both the current row's indices and the input indices are unsorted. | |
template<class OrdinalType , class IndexType > | |
IndexType | countMergeSortedIndices (const OrdinalType curInds[], const IndexType numCurInds, const OrdinalType inputInds[], const IndexType numInputInds) |
Count the number of column indices that can be merged into the current row, assuming that both the current row's indices and the input indices are sorted. | |
template<class OrdinalType , class IndexType > | |
std::pair< bool, IndexType > | mergeSortedIndices (OrdinalType curInds[], const IndexType midPos, const IndexType endPos, const OrdinalType inputInds[], const IndexType numInputInds) |
Attempt to merge the input indices into the current row's column indices, assuming that both the current row's indices and the input indices are sorted. | |
template<class OrdinalType , class IndexType > | |
std::pair< bool, IndexType > | mergeUnsortedIndices (OrdinalType curInds[], const IndexType midPos, const IndexType endPos, const OrdinalType inputInds[], const IndexType numInputInds) |
Attempt to merge the input indices into the current row's column indices, assuming that both the current row's indices and the input indices are unsorted. | |
template<class OrdinalType , class ValueType , class IndexType > | |
std::pair< bool, IndexType > | mergeUnsortedIndicesAndValues (OrdinalType curInds[], ValueType curVals[], const IndexType midPos, const IndexType endPos, const OrdinalType inputInds[], const ValueType inputVals[], const IndexType numInputInds) |
Attempt to merge the input indices and values into the current row's column indices and corresponding values, assuming that both the current row's indices and the input indices are unsorted. | |
std::string | DistributorSendTypeEnumToString (EDistributorSendType sendType) |
Convert an EDistributorSendType enum value to a string. | |
std::string | DistributorHowInitializedEnumToString (EDistributorHowInitialized how) |
Convert an EDistributorHowInitialized enum value to a string. | |
template<class LocalOrdinal , class GlobalOrdinal , class Node > | |
bool | isLocallyFitted (const Tpetra::Map< LocalOrdinal, GlobalOrdinal, Node > &map1, const Tpetra::Map< LocalOrdinal, GlobalOrdinal, Node > &map2) |
Is map1 locally fitted to map2? | |
bool | congruent (const Teuchos::Comm< int > &comm1, const Teuchos::Comm< int > &comm2) |
Whether the two communicators are congruent. | |
template<class DualViewType > | |
Teuchos::ArrayView< typename DualViewType::t_dev::value_type > | getArrayViewFromDualView (const DualViewType &x) |
Get a Teuchos::ArrayView which views the host Kokkos::View of the input 1-D Kokkos::DualView. | |
template<class T , class DT > | |
Kokkos::DualView< T *, DT > | getDualViewCopyFromArrayView (const Teuchos::ArrayView< const T > &x_av, const char label[], const bool leaveOnHost) |
Get a 1-D Kokkos::DualView which is a deep copy of the input Teuchos::ArrayView (which views host memory). | |
Namespace for Tpetra implementation details.
Status of the graph's or matrix's storage, when not in a fill-complete state.
When a CrsGraph or CrsMatrix is not fill complete, its data live in one of three storage formats:
"2-D storage": The graph stores column indices as "array of arrays," and the matrix stores values as "array of arrays." The graph must have k_numRowEntries_ allocated. This only ever exists if the graph was created with DynamicProfile. A matrix with 2-D storage must own its graph, and the graph must have 2-D storage.
"Unpacked 1-D storage": The graph uses a row offsets array, and stores column indices in a single array. The matrix also stores values in a single array. "Unpacked" means that there may be extra space in each row: that is, the row offsets array only says how much space there is in each row. The graph must use k_numRowEntries_ to find out how many entries there actually are in the row. A matrix with unpacked 1-D storage must own its graph, and the graph must have unpacked 1-D storage.
"Packed 1-D storage": Like unpacked 1-D storage, but with no extra space in any row, so the row offsets array alone gives the number of entries in each row.
With respect to the Kokkos refactor version of Tpetra, "2-D storage" should be considered a legacy option.
The phrase "When not in a fill-complete state" is important. When the graph is fill complete, it always uses 1-D "packed" storage. However, if storage is "not optimized," we retain the 1-D unpacked or 2-D format, and thus retain this enum value.
Definition at line 197 of file Tpetra_CrsGraph_decl.hpp.
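The difference between row capacity and actual entry count in "unpacked 1-D storage" can be illustrated with a minimal sketch. It uses plain std::vector rather than Kokkos, and the struct and function names (UnpackedGraphSketch, rowCapacitySketch, rowEntriesSketch) are hypothetical, not part of Tpetra. The numRowEntries member plays the role that the text assigns to k_numRowEntries_.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a graph with "unpacked 1-D storage".
struct UnpackedGraphSketch {
  std::vector<std::size_t> rowOffsets;     // size numRows + 1; space allotted to each row
  std::vector<int> colInds;                // column indices of all rows in a single array
  std::vector<std::size_t> numRowEntries;  // entries actually present in each row
};

// Number of slots allotted to row r (may exceed the number of entries).
std::size_t rowCapacitySketch(const UnpackedGraphSketch& g, const std::size_t r) {
  assert(r + 1 < g.rowOffsets.size());
  return g.rowOffsets[r + 1] - g.rowOffsets[r];
}

// Copy out only the entries that are actually in row r, skipping unused slots.
std::vector<int> rowEntriesSketch(const UnpackedGraphSketch& g, const std::size_t r) {
  assert(g.numRowEntries[r] <= rowCapacitySketch(g, r));
  const auto first = g.colInds.begin() + static_cast<std::ptrdiff_t>(g.rowOffsets[r]);
  const auto last = first + static_cast<std::ptrdiff_t>(g.numRowEntries[r]);
  return std::vector<int>(first, last);
}
```

In this sketch a row with capacity 3 and numRowEntries 2 has one unused trailing slot; a packed layout would have no such slots.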
The type of MPI send that Distributor should use.
This is an implementation detail of Distributor. Please do not rely on these values in your code.
Definition at line 76 of file Tpetra_Distributor.hpp.
Enum indicating how and whether a Distributor was initialized.
This is an implementation detail of Distributor. Please do not rely on these values in your code.
Definition at line 94 of file Tpetra_Distributor.hpp.
OffsetType Tpetra::Details::convertColumnIndicesFromGlobalToLocal | ( | const Kokkos::View< LO *, DT > & | lclColInds, |
const Kokkos::View< const GO *, DT > & | gblColInds, | ||
const Kokkos::View< const OffsetType *, DT > & | ptr, | ||
const LocalMap< LO, GO, DT > & | lclColMap, | ||
const Kokkos::View< const NumEntType *, DT > & | numRowEnt | ||
) |
Convert a (StaticProfile) CrsGraph's global column indices into local column indices.
lclColInds | [out] On output: The graph's local column indices. This may alias gblColInds, if LO == GO. |
gblColInds | [in] On input: The graph's global column indices. This may alias lclColInds, if LO == GO. |
ptr | [in] The graph's row offsets. |
lclColMap | [in] "Local" (threaded-kernel-worthy) version of the column Map. |
numRowEnt | [in] Array with number of entries in each row. |
Definition at line 147 of file Tpetra_CrsGraph_def.hpp.
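As a rough serial illustration of the conversion, the sketch below walks each row using ptr and numRowEnt and looks up every global index in a hash map that stands in for the local column Map. Everything here is an assumption made for illustration: the name convertGlobalToLocalSketch, the use of std::unordered_map in place of LocalMap, the flag value -1 for indices that are not found, and the choice to return the number of such indices (this page does not state what the real function returns).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical serial sketch; gblToLcl stands in for the "local" column Map.
// Returns how many global indices were not found (an assumption of this sketch).
std::size_t convertGlobalToLocalSketch(
    std::vector<int>& lclColInds,
    const std::vector<long long>& gblColInds,
    const std::vector<std::size_t>& ptr,
    const std::unordered_map<long long, int>& gblToLcl,
    const std::vector<std::size_t>& numRowEnt) {
  assert(ptr.size() == numRowEnt.size() + 1);
  assert(lclColInds.size() == gblColInds.size());
  std::size_t numNotFound = 0;
  for (std::size_t row = 0; row < numRowEnt.size(); ++row) {
    // Only the first numRowEnt[row] slots of the row hold real entries.
    for (std::size_t k = ptr[row]; k < ptr[row] + numRowEnt[row]; ++k) {
      const auto it = gblToLcl.find(gblColInds[k]);
      if (it == gblToLcl.end()) {
        lclColInds[k] = -1;  // flag value chosen for this sketch
        ++numNotFound;
      } else {
        lclColInds[k] = it->second;
      }
    }
  }
  return numNotFound;
}
```

Unused slots at the end of a row are left untouched, which matches the role of numRowEnt described in the parameter list.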
OffsetsViewType::non_const_value_type Tpetra::Details::computeOffsetsFromCounts | ( | const OffsetsViewType & | ptr, |
const CountsViewType & | counts | ||
) |
Compute offsets from counts.
Compute offsets from counts via prefix sum:
ptr[i+1] = \sum_{j=0}^{i} counts[j]
Thus, ptr[i+1] - ptr[i] = counts[i], so that ptr[i+1] = ptr[i] + counts[i]. If we stored counts[i] in ptr[i+1] on input, then the formula is ptr[i+1] += ptr[i].
OffsetsViewType | Type of the Kokkos::View specialization used to store the offsets; the output array of this function. |
CountsViewType | Type of the Kokkos::View specialization used to store the counts; the input array of this function. |
SizeType | The parallel loop index type; a built-in integer type. Defaults to the type of the input View's dimension. You may use a shorter type to improve performance. |
The type of each entry of the ptr array must be able to store the sum of all the entries of counts. This functor makes no attempt to check for overflow in this sum.
Definition at line 278 of file Tpetra_Details_computeOffsets.hpp.
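The prefix-sum relation ptr[i+1] = ptr[i] + counts[i] can be written as a short serial loop. The sketch below is only an illustration with std::vector; the real function operates on Kokkos::View arguments. The name offsetsFromCountsSketch is hypothetical, and returning the last entry of ptr (the total of all counts) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial sketch: fill ptr (length counts.size() + 1) with the running sum of counts.
// Returns the last offset, that is, the sum of all counts (assumed for this sketch).
long long offsetsFromCountsSketch(std::vector<long long>& ptr,
                                  const std::vector<int>& counts) {
  assert(ptr.size() == counts.size() + 1);
  ptr[0] = 0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    ptr[i + 1] = ptr[i] + counts[i];  // ptr[i+1] - ptr[i] == counts[i]
  }
  return ptr.back();
}
```

The offset type (long long here) is deliberately wider than the count type (int), in line with the requirement that each entry of ptr be able to hold the full sum.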
OffsetsViewType::non_const_value_type Tpetra::Details::computeOffsetsFromConstantCount | ( | const OffsetsViewType & | ptr, |
const CountType & | count | ||
) |
Compute offsets from a constant count.
Compute offsets from a constant count via prefix sum:
ptr[i+1] = \sum_{j=0}^{i} count
Thus, ptr[i+1] - ptr[i] = count, so that ptr[i+1] = ptr[i] + count.
OffsetsViewType | Type of the Kokkos::View specialization used to store the offsets; the output array of this function. |
CountType | Type of the constant count; the input argument of this function. |
SizeType | The parallel loop index type; a built-in integer type. Defaults to the type of the output View's dimension. You may use a shorter type to improve performance. |
The type of each entry of the ptr array must be able to store ptr.dimension_0() * count. This functor makes no attempt to check for overflow in this product.
Definition at line 407 of file Tpetra_Details_computeOffsets.hpp.
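With a constant count the prefix sum collapses to ptr[i] = i * count, so each entry can be computed independently. A minimal serial sketch follows, with a hypothetical name (offsetsFromConstantCountSketch) and std::vector in place of a Kokkos::View; returning the last entry is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial sketch: ptr[i] = i * count for every entry of ptr.
// Returns the last offset (assumed for this sketch).
long long offsetsFromConstantCountSketch(std::vector<long long>& ptr, const int count) {
  assert(!ptr.empty());
  for (std::size_t i = 0; i < ptr.size(); ++i) {
    ptr[i] = static_cast<long long>(i) * count;  // no overflow check, as in the text
  }
  return ptr.back();
}
```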
void Tpetra::Details::copyOffsets | ( | const OutputViewType & | dst, |
const InputViewType & | src | ||
) |
Copy row offsets (in a sparse graph or matrix) from src to dst. The offsets may have different types.
The implementation reserves the right to do bounds checking if the offsets in the two arrays have different types.
Everything above is an implementation detail of this function, copyOffsets. This function in turn is an implementation detail of FixedHashTable, in particular of the "copy constructor" that copies a FixedHashTable from one Kokkos device to another. copyOffsets copies the array of offsets (ptr_).
Definition at line 399 of file Tpetra_Details_copyOffsets.hpp.
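One way to picture copying offsets between arrays of different integer types, with the optional bounds checking mentioned above, is the sketch below. It is not the Tpetra implementation: the name copyOffsetsSketch, the fixed source and destination types, and the boolean return value are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sketch: copy 64-bit unsigned offsets into 32-bit signed offsets.
// Returns false as soon as a source offset does not fit in the destination type.
bool copyOffsetsSketch(std::vector<std::int32_t>& dst,
                       const std::vector<std::uint64_t>& src) {
  assert(dst.size() == src.size());
  const std::uint64_t maxDst =
      static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (src[i] > maxDst) {
      return false;  // bounds check: value would overflow the destination type
    }
    dst[i] = static_cast<std::int32_t>(src[i]);
  }
  return true;
}
```

When both arrays have the same entry type, no such check is needed and a plain copy suffices.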
void Tpetra::Details::gathervPrint | ( | std::ostream & | out, |
const std::string & | s, | ||
const Teuchos::Comm< int > & | comm | ||
) |
On Process 0 in the given communicator, print strings from each process in that communicator, in rank order.
For each process in the given communicator comm, send its string s to Process 0 in that communicator. Process 0 prints the strings in rank order.
This is a collective over the given communicator comm. Process 0 promises not to store all the strings in its memory. This function's total memory usage on any process is proportional to the calling process' string length, plus the max string length over any process. This does NOT depend on the number of processes in the communicator. Thus, we call this a "memory-scalable" operation. While the function's name suggests MPI_Gatherv, the implementation may NOT use MPI_Gather or MPI_Gatherv, because neither of those is memory scalable.
Process 0 prints nothing other than what is in the string. It does not add an endline after each string, nor does it identify each string with its owning process' rank. If you want either of those in the string, you have to put it there yourself.
out | [out] The output stream to which to write. ONLY Process 0 in the given communicator will write to this. Thus, this stream need only be valid on Process 0. |
s | [in] The string to write. Each process in the given communicator has its own string. Strings may be different on different processes. Zero-length strings are OK. |
comm | [in] The communicator over which this operation is a collective. |
Definition at line 52 of file Tpetra_Details_gathervPrint.cpp.
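static LocalMapType::local_ordinal_type Tpetra::Details::getDiagCopyWithoutOffsets | ( | const DiagType & | D, |
const LocalMapType & | rowMap, | ||
const LocalMapType & | colMap, | ||
const CrsMatrixType & | A | ||
) |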
Given a locally indexed, local sparse matrix, and corresponding local row and column Maps, extract the matrix's diagonal entries into a 1-D Kokkos::View.
This function implements much of the one-argument overload of Tpetra::CrsMatrix::getLocalDiagCopy, for the case where the matrix is fill complete. The function computes offsets of diagonal entries inline, and does not store them. If you want to store the offsets, call computeOffsets() instead.
DiagType | 1-D nonconst Kokkos::View |
CrsMatrixType | Specialization of KokkosSparse::CrsMatrix |
LocalMapType | Specialization of Tpetra::Details::LocalMap; type of the "local" part of a Tpetra::Map |
D | [out] 1-D Kokkos::View to which to write the diagonal entries. |
rowMap | [in] "Local" part of the sparse matrix's row Map. |
colMap | [in] "Local" part of the sparse matrix's column Map. |
A | [in] The sparse matrix. |
Definition at line 182 of file Tpetra_Details_getDiagCopyWithoutOffsets_decl.hpp.
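The idea of finding diagonal entries inline, without stored offsets, can be sketched serially as follows. For each local row, the sketch translates the row to its global index, then scans the row for the entry whose column translates to the same global index. The name diagCopySketch, the use of std::vector for the Maps and the CSR arrays, and the choice to write zero when a row has no diagonal entry are assumptions of this sketch, not statements about the Tpetra implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// rowMap[r] and colMap[c] give the global index of local row r and local column c.
// rowPtr, colInd, vals form a locally indexed CSR matrix.
void diagCopySketch(std::vector<double>& D,
                    const std::vector<long long>& rowMap,
                    const std::vector<long long>& colMap,
                    const std::vector<std::size_t>& rowPtr,
                    const std::vector<int>& colInd,
                    const std::vector<double>& vals) {
  assert(D.size() == rowMap.size());
  assert(rowPtr.size() == rowMap.size() + 1);
  for (std::size_t r = 0; r < rowMap.size(); ++r) {
    D[r] = 0.0;  // value used when the row stores no diagonal entry (sketch's choice)
    for (std::size_t k = rowPtr[r]; k < rowPtr[r + 1]; ++k) {
      const std::size_t lclCol = static_cast<std::size_t>(colInd[k]);
      if (colMap[lclCol] == rowMap[r]) {  // same global index: diagonal entry
        D[r] = vals[k];
        break;
      }
    }
  }
}
```

Because row and column Maps may number the same global index differently, the comparison has to be made on global indices rather than on local ones.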
LO Tpetra::Details::getLocalDiagCopyWithoutOffsetsNotFillComplete | ( | ::Tpetra::Vector< SC, LO, GO, NT > & | diag, |
const ::Tpetra::RowMatrix< SC, LO, GO, NT > & | A, | ||
const bool | debug = false | ||
) |
Given a locally indexed, global sparse matrix, extract the matrix's diagonal entries into a Tpetra::Vector.
This function is a work-around for Github Issue #499. It implements one-argument Tpetra::CrsMatrix::getLocalDiagCopy for the case where the matrix is not fill complete. The function computes offsets of diagonal entries inline, and does not store them. If you want to store the offsets, call computeOffsets() instead.
SC | Same as first template parameter (Scalar) of Tpetra::CrsMatrix and Tpetra::Vector. |
LO | Same as second template parameter (LocalOrdinal) of Tpetra::CrsMatrix and Tpetra::Vector. |
GO | Same as third template parameter (GlobalOrdinal) of Tpetra::CrsMatrix and Tpetra::Vector. |
NT | Same as fourth template parameter (Node) of Tpetra::CrsMatrix and Tpetra::Vector. |
diag | [out] Tpetra::Vector to which to write the diagonal entries. Its Map must be the same (in the sense of Tpetra::Map::isSameAs()) as the row Map of A . |
A | [in] The sparse matrix. Must be a Tpetra::RowMatrix (the base class of Tpetra::CrsMatrix), must be locally indexed, and must have row views. |
debug | [in] Whether to do extra run-time checks. This costs MPI communication. The default is false in a release build, and true in a debug build. |
We pass in the sparse matrix as a Tpetra::RowMatrix because the implementation of Tpetra::CrsMatrix uses this function, and we want to avoid a circular header dependency. On the other hand, the implementation does not actually depend on Tpetra::CrsMatrix.
Definition at line 192 of file Tpetra_Details_getDiagCopyWithoutOffsets_def.hpp.
MapType Tpetra::Details::makeOptimizedColMap | ( | std::ostream & | errStream, |
bool & | lclErr, | ||
const MapType & | domMap, | ||
const MapType & | colMap | ||
) |
Return an optimized reordering of the given column Map.
MapType | A specialization of Map. |
errStream | [out] Output stream for human-readable error reporting. This is local to the calling process and may differ on different processes. |
lclErr | [out] On output: true if anything went wrong on the calling process. This value is local to the calling process and may differ on different processes. |
domMap | [in] Domain Map of a CrsGraph or CrsMatrix. |
colMap | [in] Original column Map of the same CrsGraph or CrsMatrix as domMap. |
Returns newColMap, the optimized reordering of colMap.
This is a convenience wrapper for makeOptimizedColMapAndImport(). (Please refer to that function's documentation in this file.) It does everything that that function does, except that it does not compute a new Import.
Definition at line 310 of file Tpetra_Details_makeOptimizedColMap.hpp.
std::pair<MapType, Teuchos::RCP<typename OptColMap<MapType>::import_type> > Tpetra::Details::makeOptimizedColMapAndImport | ( | std::ostream & | errStream, |
bool & | lclErr, | ||
const MapType & | domMap, | ||
const MapType & | colMap, | ||
const typename OptColMap< MapType >::import_type * | oldImport, | ||
const bool | makeImport | ||
) |
Return an optimized reordering of the given column Map. Optionally, recompute an Import from the input domain Map to the new column Map.
MapType | A specialization of Map. |
This function takes a domain Map and a column Map of a distributed graph (Tpetra::CrsGraph) or matrix (e.g., Tpetra::CrsMatrix). It then creates a new column Map, which optimizes the performance of an Import operation from the domain Map to the new column Map. This function also optionally creates that Import. Creating the new column Map and its Import at the same time saves some communication, since making the Import requires some of the same information that optimizing the column Map does.
errStream | [out] Output stream for human-readable error reporting. This is local to the calling process and may differ on different processes. |
lclErr | [out] On output: true if anything went wrong on the calling process. This value is local to the calling process and may differ on different processes. |
domMap | [in] Domain Map of a CrsGraph or CrsMatrix. |
colMap | [in] Original column Map of the same CrsGraph or CrsMatrix as domMap. |
makeImport | [in] Whether to make and return an Import from the input domain Map to the new column Map. |
Returns newColMap, and the corresponding Import from domMap to newColMap. The latter is nonnull if and only if makeImport is true.
domMap and colMap must have the same or congruent communicators. The indices in colMap must be a subset of the indices in domMap.
The returned column Map's global indices (GIDs) will have the following order on all calling processes:
GIDs in both colMap and domMap (on the calling process) go first.
GIDs in colMap on the calling process, but not in the domain Map on the calling process, follow. They are ordered first contiguously by their owning process rank (in the domain Map), then in increasing order within that.
This imitates the ordering used by AztecOO and Epetra. Storing indices contiguously that are owned by the same process (in the domain Map) permits the use of contiguous send and receive buffers in Distributor, which is used in an Import operation.
Definition at line 383 of file Tpetra_Details_makeOptimizedColMap.hpp.
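The ordering rule described above can be sketched on one process as follows. The sketch takes the calling process's column Map GIDs, a lookup from GID to owning rank in the domain Map, and the calling process's rank. The function name orderColMapGidsSketch and the ownerRank lookup table are hypothetical; in practice the owning ranks would come from communication, which this sketch omits. Keeping locally owned GIDs in their original relative order is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Locally owned GIDs first, then remote GIDs grouped by owning rank,
// in increasing GID order within each rank.
std::vector<long long> orderColMapGidsSketch(
    const std::vector<long long>& colGids,
    const std::unordered_map<long long, int>& ownerRank,
    const int myRank) {
  std::vector<long long> ordered;
  std::vector<std::pair<int, long long>> remote;  // (owning rank, GID)
  for (const long long gid : colGids) {
    const auto it = ownerRank.find(gid);
    assert(it != ownerRank.end());  // every column GID must have an owner in the domain Map
    if (it->second == myRank) {
      ordered.push_back(gid);
    } else {
      remote.emplace_back(it->second, gid);
    }
  }
  std::sort(remote.begin(), remote.end());  // sorts by rank, then by GID
  for (const auto& rankAndGid : remote) {
    ordered.push_back(rankAndGid.second);
  }
  return ordered;
}
```

Grouping remote GIDs by owning rank is what makes each rank's data land in one contiguous block of the receive buffer.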
IndexType Tpetra::Details::countMergeUnsortedIndices | ( | const OrdinalType | curInds[], |
const IndexType | numCurInds, | ||
const OrdinalType | inputInds[], | ||
const IndexType | numInputInds | ||
) |
Count the number of column indices that can be merged into the current row, assuming that both the current row's indices and the input indices are unsorted.
Neither the current row's entries, nor the input, are sorted. Return the number of input entries that can be merged into the current row. Don't actually merge them. 'numCurInds' corresponds to 'midPos' in mergeUnsortedIndices.
The current indices are NOT allowed to have repeats, but the input indices ARE allowed to have repeats. (The whole point of these methods is to keep the current entries without repeats – "merged in.") Repeats in the input are counted separately with respect to merges.
The unsorted case is bad for asymptotics, but the asymptotics only show up with dense or nearly dense rows, which are bad for other reasons.
Definition at line 74 of file Tpetra_Details_Merge.hpp.
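A minimal sketch of the unsorted count follows. It assumes that "can be merged" means the input index already occurs among the current row's indices, and it counts each repeated input index separately, as the text describes. The name countMergeUnsortedSketch is hypothetical, and this is not the Tpetra source.

```cpp
#include <cassert>

// Count input entries that match some entry of the current row.
// Quadratic in the worst case, which is the "bad asymptotics" of the unsorted case.
template <class OrdinalType, class IndexType>
IndexType countMergeUnsortedSketch(const OrdinalType curInds[],
                                   const IndexType numCurInds,
                                   const OrdinalType inputInds[],
                                   const IndexType numInputInds) {
  assert(curInds != nullptr || numCurInds == 0);
  IndexType mergeCount = 0;
  for (IndexType j = 0; j < numInputInds; ++j) {
    for (IndexType i = 0; i < numCurInds; ++i) {
      if (curInds[i] == inputInds[j]) {
        ++mergeCount;  // a repeated input index is counted each time it appears
        break;         // current indices have no repeats, so stop at the first match
      }
    }
  }
  return mergeCount;
}
```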
IndexType Tpetra::Details::countMergeSortedIndices | ( | const OrdinalType | curInds[], |
const IndexType | numCurInds, | ||
const OrdinalType | inputInds[], | ||
const IndexType | numInputInds | ||
) |
Count the number of column indices that can be merged into the current row, assuming that both the current row's indices and the input indices are sorted.
Both the current row's entries and the input are sorted. Return the number of input entries that can be merged into the current row. Don't actually merge them. 'numCurInds' corresponds to 'midPos' in mergeSortedIndices.
The current indices are NOT allowed to have repeats, but the input indices ARE allowed to have repeats. (The whole point of these methods is to keep the current entries without repeats – "merged in.") Repeats in the input are counted separately with respect to merges.
The sorted case is good for asymptotics, but imposes an order on the entries of each row. Sometimes users don't want that.
Definition at line 133 of file Tpetra_Details_Merge.hpp.
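When both arrays are sorted, the same count can be obtained in a single linear pass with two cursors. The sketch below makes the same assumption as for the unsorted case (an input index "can be merged" when it already occurs in the current row) and uses a hypothetical name, countMergeSortedSketch.

```cpp
#include <cassert>

// Linear-time count for sorted inputs: advance a cursor through curInds
// while walking inputInds once.
template <class OrdinalType, class IndexType>
IndexType countMergeSortedSketch(const OrdinalType curInds[],
                                 const IndexType numCurInds,
                                 const OrdinalType inputInds[],
                                 const IndexType numInputInds) {
  assert(curInds != nullptr || numCurInds == 0);
  IndexType mergeCount = 0;
  IndexType i = 0;
  for (IndexType j = 0; j < numInputInds; ++j) {
    while (i < numCurInds && curInds[i] < inputInds[j]) {
      ++i;  // skip current indices smaller than the input index
    }
    if (i < numCurInds && curInds[i] == inputInds[j]) {
      ++mergeCount;  // do not advance i, so repeated input indices are each counted
    }
  }
  return mergeCount;
}
```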
std::pair<bool, IndexType> Tpetra::Details::mergeSortedIndices | ( | OrdinalType | curInds[], |
const IndexType | midPos, | ||
const IndexType | endPos, | ||
const OrdinalType | inputInds[], | ||
const IndexType | numInputInds | ||
) |
Attempt to merge the input indices into the current row's column indices, assuming that both the current row's indices and the input indices are sorted.
Both the current row's entries and the input are sorted. If and only if the current row has enough space for the input (after merging), merge the input with the current row.
Assume that both curInds and inputInds are sorted. Current indices: curInds[0 .. midPos-1]. Extra space at end: curInds[midPos .. endPos-1]. Input indices to merge in: inputInds[0 .. numInputInds-1]. Any of those could be empty.
If the merge succeeded, return true and the new number of entries in the row. Else, return false and the new number of entries in the row required to fit the input.
The sorted case is good for asymptotics, but imposes an order on the entries of each row. Sometimes users don't want that.
Definition at line 205 of file Tpetra_Details_Merge.hpp.
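The contract above (merge only if the result fits, otherwise report the required size) can be sketched with the standard library. This version builds the merged row in a temporary buffer using std::set_union, which is a simplification chosen for clarity and not necessarily how Tpetra does it; the name mergeSortedSketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

template <class OrdinalType, class IndexType>
std::pair<bool, IndexType> mergeSortedSketch(OrdinalType curInds[],
                                             const IndexType midPos,
                                             const IndexType endPos,
                                             const OrdinalType inputInds[],
                                             const IndexType numInputInds) {
  assert(midPos <= endPos);
  // Drop repeats from the sorted input so the union contains each index once.
  std::vector<OrdinalType> input(inputInds, inputInds + numInputInds);
  input.erase(std::unique(input.begin(), input.end()), input.end());

  std::vector<OrdinalType> merged;
  std::set_union(curInds, curInds + midPos, input.begin(), input.end(),
                 std::back_inserter(merged));

  const IndexType newNumEnt = static_cast<IndexType>(merged.size());
  if (newNumEnt > endPos) {
    return {false, newNumEnt};  // not enough space; the row is left unchanged
  }
  std::copy(merged.begin(), merged.end(), curInds);
  return {true, newNumEnt};
}
```

On failure the second member of the pair tells the caller how many slots the row would need, so the caller can reallocate and retry.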
std::pair<bool, IndexType> Tpetra::Details::mergeUnsortedIndices | ( | OrdinalType | curInds[], |
const IndexType | midPos, | ||
const IndexType | endPos, | ||
const OrdinalType | inputInds[], | ||
const IndexType | numInputInds | ||
) |
Attempt to merge the input indices into the current row's column indices, assuming that both the current row's indices and the input indices are unsorted.
Neither the current row's entries nor the input are sorted. If and only if the current row has enough space for the input (after merging), merge the input with the current row.
Assume that neither curInds nor inputInds are sorted. Current indices: curInds[0 .. midPos-1]. Extra space at end: curInds[midPos .. endPos-1]. Input indices to merge in: inputInds[0 .. numInputInds-1]. Any of those could be empty.
If the merge succeeded, return true and the new number of entries in the row. Else, return false and the new number of entries in the row required to fit the input.
The unsorted case is bad for asymptotics, but the asymptotics only show up with dense or nearly dense rows, which are bad for other reasons.
Definition at line 325 of file Tpetra_Details_Merge.hpp.
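For the unsorted case, one simple approach is to collect the input indices that are not yet in the row, check that they fit, and then append them. The sketch below does this with linear searches; the name mergeUnsortedSketch and the use of a temporary vector are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

template <class OrdinalType, class IndexType>
std::pair<bool, IndexType> mergeUnsortedSketch(OrdinalType curInds[],
                                               const IndexType midPos,
                                               const IndexType endPos,
                                               const OrdinalType inputInds[],
                                               const IndexType numInputInds) {
  assert(midPos <= endPos);
  // Collect input indices that are neither in the row nor already queued.
  std::vector<OrdinalType> toAppend;
  for (IndexType j = 0; j < numInputInds; ++j) {
    const OrdinalType ind = inputInds[j];
    const bool inRow = std::find(curInds, curInds + midPos, ind) != curInds + midPos;
    const bool queued = std::find(toAppend.begin(), toAppend.end(), ind) != toAppend.end();
    if (!inRow && !queued) {
      toAppend.push_back(ind);
    }
  }
  const IndexType newNumEnt = midPos + static_cast<IndexType>(toAppend.size());
  if (newNumEnt > endPos) {
    return {false, newNumEnt};  // not enough space; the row is left unchanged
  }
  std::copy(toAppend.begin(), toAppend.end(), curInds + midPos);
  return {true, newNumEnt};
}
```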
std::pair<bool, IndexType> Tpetra::Details::mergeUnsortedIndicesAndValues | ( | OrdinalType | curInds[], |
ValueType | curVals[], | ||
const IndexType | midPos, | ||
const IndexType | endPos, | ||
const OrdinalType | inputInds[], | ||
const ValueType | inputVals[], | ||
const IndexType | numInputInds | ||
) |
Attempt to merge the input indices and values into the current row's column indices and corresponding values, assuming that both the current row's indices and the input indices are unsorted.
Neither the current row's entries nor the input are sorted. If and only if the current row has enough space for the input (after merging), merge the input with the current row.
Assume that neither curInds nor inputInds are sorted. Current indices: curInds[0 .. midPos-1]. Current values: curVals[0 .. midPos-1]. Extra space for indices at end: curInds[midPos .. endPos-1]. Extra space for values at end: curVals[midPos .. endPos-1]. Input indices to merge in: inputInds[0 .. numInputInds-1]. Input values to merge in: inputVals[0 .. numInputInds-1].
If the merge succeeded, return true and the new number of entries in the row. Else, return false and the new number of entries in the row required to fit the input.
The unsorted case is bad for asymptotics, but the asymptotics only show up with dense or nearly dense rows, which are bad for other reasons.
Definition at line 419 of file Tpetra_Details_Merge.hpp.
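A sketch of the index-and-value variant follows. It first counts how many new indices the input would add, and only modifies the row if they fit. The text above does not say how values for matching indices are combined; this sketch assumes they are added to the existing value, and that assumption, like the name mergeWithValuesSketch, is not taken from the Tpetra source.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

template <class OrdinalType, class ValueType, class IndexType>
std::pair<bool, IndexType> mergeWithValuesSketch(OrdinalType curInds[],
                                                 ValueType curVals[],
                                                 const IndexType midPos,
                                                 const IndexType endPos,
                                                 const OrdinalType inputInds[],
                                                 const ValueType inputVals[],
                                                 const IndexType numInputInds) {
  assert(midPos <= endPos);
  // Pass 1: find the distinct input indices that are not yet in the row.
  std::vector<OrdinalType> fresh;
  for (IndexType j = 0; j < numInputInds; ++j) {
    const OrdinalType ind = inputInds[j];
    const bool inRow = std::find(curInds, curInds + midPos, ind) != curInds + midPos;
    const bool seen = std::find(fresh.begin(), fresh.end(), ind) != fresh.end();
    if (!inRow && !seen) fresh.push_back(ind);
  }
  const IndexType required = midPos + static_cast<IndexType>(fresh.size());
  if (required > endPos) {
    return {false, required};  // not enough space; indices and values are unchanged
  }
  // Pass 2: add to the value of a matching index (assumed), or append a new entry.
  IndexType numEnt = midPos;
  for (IndexType j = 0; j < numInputInds; ++j) {
    OrdinalType* const end = curInds + numEnt;
    OrdinalType* const pos = std::find(curInds, end, inputInds[j]);
    if (pos != end) {
      curVals[pos - curInds] += inputVals[j];
    } else {
      curInds[numEnt] = inputInds[j];
      curVals[numEnt] = inputVals[j];
      ++numEnt;
    }
  }
  return {true, numEnt};
}
```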
std::string Tpetra::Details::DistributorSendTypeEnumToString | ( | EDistributorSendType | sendType | ) |
Convert an EDistributorSendType enum value to a string.
This is an implementation detail of Distributor. Please do not rely on this function in your code.
Definition at line 49 of file Tpetra_Distributor.cpp.
std::string Tpetra::Details::DistributorHowInitializedEnumToString | ( | EDistributorHowInitialized | how | ) |
Convert an EDistributorHowInitialized enum value to a string.
This is an implementation detail of Distributor. Please do not rely on this function in your code.
Definition at line 70 of file Tpetra_Distributor.cpp.
bool Tpetra::Details::isLocallyFitted | ( | const Tpetra::Map< LocalOrdinal, GlobalOrdinal, Node > & | map1, |
const Tpetra::Map< LocalOrdinal, GlobalOrdinal, Node > & | map2 | ||
) |
Is map1 locally fitted to map2?
For Map instances map1 and map2, we say that map1 is locally fitted to map2 (on the calling process), when the initial indices of map1 (on the calling process) are the same and in the same order as those of map2. "Fittedness" is entirely a local (per MPI process) property.
The predicate "is map1 fitted to map2 ?" is not symmetric. For example, map2 may have more entries than map1.
Fittedness on a process can let Tpetra avoid deep copies in some Export or Import (communication) operations. Tpetra could use this, for example, in optimizing its sparse matrix-vector multiply.
Definition at line 1925 of file Tpetra_Map_def.hpp.
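A small sketch of the local check, written against plain lists of global indices instead of Tpetra::Map objects. It assumes the reading suggested by the asymmetry example above: map1 is locally fitted to map2 when map1's indices on the calling process form an initial segment of map2's indices, in the same order. The name isLocallyFittedSketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// map1Gids and map2Gids hold each Map's global indices on the calling process,
// in local index order.
bool isLocallyFittedSketch(const std::vector<long long>& map1Gids,
                           const std::vector<long long>& map2Gids) {
  if (map1Gids.size() > map2Gids.size()) {
    return false;  // map2 may have more entries than map1, but not fewer
  }
  // Compare map1's indices against the leading indices of map2.
  return std::equal(map1Gids.begin(), map1Gids.end(), map2Gids.begin());
}
```

The check involves no communication, consistent with fittedness being a per-process property.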
bool Tpetra::Details::congruent | ( | const Teuchos::Comm< int > & | comm1, |
const Teuchos::Comm< int > & | comm2 | ||
) |
Whether the two communicators are congruent.
Two communicators are congruent when they have the same number of processes, and those processes occur in the same rank order.
If both communicators are Teuchos::MpiComm instances, this function returns true exactly when MPI_Comm_compare returns MPI_IDENT (the communicators are handles for the same object) or MPI_CONGRUENT on their MPI_Comm handles. Any two Teuchos::SerialComm instances are always congruent. An MpiComm instance is congruent to a SerialComm instance if and only if the MpiComm has one process. This function is symmetric in its arguments.
If either Teuchos::Comm instance is neither an MpiComm nor a SerialComm, this method cannot do any better than to compare their process counts.
Definition at line 65 of file Tpetra_Util.cpp.
Teuchos::ArrayView<typename DualViewType::t_dev::value_type> Tpetra::Details::getArrayViewFromDualView | ( | const DualViewType & | x | ) |
Get a Teuchos::ArrayView which views the host Kokkos::View of the input 1-D Kokkos::DualView.
x | [in] A specialization of Kokkos::DualView. |
Definition at line 875 of file Tpetra_Util.hpp.
Kokkos::DualView<T*, DT> Tpetra::Details::getDualViewCopyFromArrayView | ( | const Teuchos::ArrayView< const T > & | x_av, |
const char | label[], | ||
const bool | leaveOnHost | ||
) |
Get a 1-D Kokkos::DualView which is a deep copy of the input Teuchos::ArrayView (which views host memory).
T | The type of the entries of the input Teuchos::ArrayView. |
DT | The Kokkos Device type. |
x_av | [in] The Teuchos::ArrayView to copy. |
label | [in] String label for the Kokkos::DualView. |
leaveOnHost | [in] If true, the host version of the returned Kokkos::DualView is most recently updated (and the DualView may need a sync to device). If false, the device version is most recently updated (and the DualView may need a sync to host). |
Definition at line 909 of file Tpetra_Util.hpp.