CONTENTS

Title Page
Copyright Page
Preface

1 Using the Level 1 BLAS Subprograms and Extensions
  1.1 Level 1 BLAS Operations
    1.1.1 Types of Operations
    1.1.2 Vector Storage
    1.1.3 Accuracy
    1.1.4 Naming Conventions
  1.2 Summary of Level 1 BLAS Subprograms
  1.3 Calling Subprograms
  1.4 Argument Conventions
  1.5 Definition of Absolute Value
  1.6 A Look at a Level 1 Extensions Subprogram

  Level 1 BLAS Subprograms
    ISAMAX IDAMAX ICAMAX IZAMAX
    SASUM DASUM SCASUM DZASUM
    SAXPY DAXPY CAXPY ZAXPY
    SCOPY DCOPY CCOPY ZCOPY
    SDOT DDOT DSDOT CDOTC ZDOTC CDOTU ZDOTU
    SDSDOT
    SNRM2 DNRM2 SCNRM2 DZNRM2
    SROT DROT CROT ZROT CSROT ZDROT
    SROTG DROTG CROTG ZROTG
    SROTM DROTM
    SROTMG DROTMG
    SSCAL DSCAL CSCAL ZSCAL, CSSCAL ZDSCAL
    SSWAP DSWAP CSWAP ZSWAP

  Level 1 BLAS Extensions Subprograms
    ISAMIN IDAMIN ICAMIN IZAMIN
    ISMAX IDMAX
    ISMIN IDMIN
    SAMAX DAMAX SCAMAX DZAMAX
    SAMIN DAMIN SCAMIN DZAMIN
    SMAX DMAX
    SMIN DMIN
    SNORM2 DNORM2 SCNORM2 DZNORM2
    SNRSQ DNRSQ SCNRSQ DZNRSQ
    SSET DSET CSET ZSET
    SSUM DSUM CSUM ZSUM
    SVCAL DVCAL CVCAL ZVCAL CSVCAL, ZDVCAL
    SZAXPY DZAXPY CZAXPY ZZAXPY

2 Using the Sparse Level 1 BLAS Subprograms
  2.1 Sparse Level 1 BLAS Operations
    2.1.1 Types of Operations
    2.1.2 Sparse Vector Storage
    2.1.3 Vector and Scalar Accuracy
    2.1.4 Naming Conventions
  2.2 Summary of Sparse Level 1 BLAS Subprograms
  2.3 Calling Subprograms
  2.4 Argument Conventions
    2.4.1 Defining the Number of Nonzero Elements
    2.4.2 Defining the Input Scalar
    2.4.3 Describing the Input/Output Vectors
  2.5 A Look At a Sparse Level 1 BLAS Subprogram

  Sparse Level 1 BLAS Subprograms
    SAXPYI DAXPYI CAXPYI ZAXPYI
    SDOTI DDOTI CDOTUI ZDOTUI CDOTCI ZDOTCI
    SGTHR DGTHR CGTHR ZGTHR
    SGTHRS DGTHRS CGTHRS ZGTHRS
    SGTHRZ DGTHRZ CGTHRZ ZGTHRZ
    SROTI DROTI
    SSCTR DSCTR CSCTR ZSCTR
    SSCTRS DSCTRS CSCTRS ZSCTRS
    SSUMI DSUMI CSUMI ZSUMI

3 Using the Level 2 BLAS Subprograms
  3.1 Level 2 BLAS Operations
    3.1.1 Types of Operations
    3.1.2 Vector and Matrix Storage
    3.1.3 Naming Conventions for Level 2 BLAS Subprograms
  3.2 Summary of Level 2 BLAS Subprograms
  3.3 Calling Subprograms
  3.4 Argument Conventions
    3.4.1 Specifying Matrix Options
    3.4.2 Defining the Size of the Matrix
    3.4.3 Describing the Matrix
    3.4.4 Describing the Input Scalars
    3.4.5 Describing the Vectors
    3.4.6 Invalid Arguments
  3.5 Rank-One and Rank-Two Updates to Band Matrices
  3.6 A Look at a Level 2 BLAS Subroutine

  Level 2 BLAS Subprograms
    SGBMV DGBMV CGBMV ZGBMV
    SGEMV DGEMV CGEMV ZGEMV
    SGER DGER CGERC ZGERC CGERU ZGERU
    SSBMV DSBMV CHBMV ZHBMV
    SSPMV DSPMV CHPMV ZHPMV
    SSPR DSPR CHPR ZHPR
    SSPR2 DSPR2 CHPR2 ZHPR2
    SSYMV DSYMV CHEMV ZHEMV
    SSYR DSYR CHER ZHER
    SSYR2 DSYR2 CHER2 ZHER2
    STBMV DTBMV CTBMV ZTBMV
    STBSV DTBSV CTBSV ZTBSV
    STPMV DTPMV CTPMV ZTPMV
    STPSV DTPSV CTPSV ZTPSV
    STRMV DTRMV CTRMV ZTRMV
    STRSV DTRSV CTRSV ZTRSV

4 Using the Level 3 BLAS Subprograms
  4.1 Level 3 BLAS Operations
    4.1.1 Types of Operations
    4.1.2 Matrix Storage
    4.1.3 Naming Conventions
  4.2 Summary of Level 3 BLAS Subprograms
  4.3 Calling the Subprograms
  4.4 Argument Conventions
    4.4.1 Specifying Matrix Options
    4.4.2 Defining the Size of the Matrices
    4.4.3 Describing the Matrices
    4.4.4 Specifying the Input Scalar
    4.4.5 Invalid Arguments
  4.5 A Look at a Level 3 BLAS Subroutine

  Level 3 BLAS Subroutines
    SGEMA DGEMA CGEMA ZGEMA
    SGEMM DGEMM CGEMM ZGEMM
    SGEMS DGEMS CGEMS ZGEMS
    SGEMT DGEMT CGEMT ZGEMT
    SSYMM DSYMM CSYMM ZSYMM CHEMM ZHEMM
    SSYRK DSYRK CSYRK ZSYRK CHERK, ZHERK
    SSYR2K DSYR2K CSYR2K ZSYR2K CHER2K, ZHER2K
    STRMM DTRMM CTRMM ZTRMM
    STRSM DTRSM CTRSM ZTRSM

5 Using the Signal Processing Subprograms
  5.1 Fourier Transform
    5.1.1 Mathematical Definition of FFT
      5.1.1.1 One-Dimensional Continuous Fourier Transform
      5.1.1.2 One-Dimensional Discrete Fourier Transform
      5.1.1.3 Two-Dimensional Discrete Fourier Transform
      5.1.1.4 Three-Dimensional Discrete Fourier Transform
      5.1.1.5 Size of Fourier Transform
    5.1.2 Data Storage
      5.1.2.1 Storing the Fourier Coefficient of 1D-FFT
      5.1.2.2 Storing the Fourier Coefficient of 2D-FFT
      5.1.2.3 Storing the Fourier Coefficient of 3D-FFT
      5.1.2.4 Storing the Fourier Coefficient of Group FFT
    5.1.3 DXML's FFT Functions
      5.1.3.1 Choosing Data Lengths
      5.1.3.2 Input and Output Data Format
      5.1.3.3 Using the Internal Data Structures
      5.1.3.4 Naming Conventions
      5.1.3.5 Summary of Fourier Transform Functions
  5.2 Convolution and Correlation
    5.2.1 Mathematical Definitions of Correlation and Convolution
      5.2.1.1 Definition of the Discrete Nonperiodic Convolution
      5.2.1.2 Definition of the Discrete Nonperiodic Correlation
      5.2.1.3 Periodic Convolution and Correlation
    5.2.2 DXML's Convolution and Correlation Subroutines
      5.2.2.1 Using FFT Methods for Convolution and Correlation
      5.2.2.2 Naming Conventions
      5.2.2.3 Summary of Convolution and Correlation Subroutines
  5.3 Digital Filtering
    5.3.1 Mathematical Definition of the Nonrecursive Filter
    5.3.2 Controlling Filter Type
    5.3.3 Controlling Filter Sharpness and Smoothness
    5.3.4 DXML's Digital Filter Subroutines
      5.3.4.1 Naming Conventions
      5.3.4.2 Summary of Digital Filter Subroutines

  Fast Fourier Transforms
    _FFT
    _FFT_INIT
    _FFT_APPLY
    _FFT_EXIT
    _FFT_2D
    _FFT_INIT_2D
    _FFT_APPLY_2D
    _FFT_EXIT_2D
    _FFT_3D
    _FFT_INIT_3D
    _FFT_APPLY_3D
    _FFT_EXIT_3D
    _FFT_GRP
    _FFT_INIT_GRP
    _FFT_APPLY_GRP
    _FFT_EXIT_GRP

  Convolutions and Correlations
    _CONV_NONPERIODIC
    _CONV_PERIODIC
    _CORR_NONPERIODIC
    _CORR_PERIODIC
    _CONV_NONPERIODIC_EXT
    _CONV_PERIODIC_EXT
    _CORR_NONPERIODIC_EXT
    _CORR_PERIODIC_EXT

  Filters
    SFILTER_NONREC
    SFILTER_INIT_NONREC
    SFILTER_APPLY_NONREC

6 Using the Iterative Solvers for Sparse Linear Systems
  6.1 Introduction
  6.2 Interface to the Iterative Solver
    6.2.1 Matrix-Vector Product
    6.2.2 Preconditioning
    6.2.3 Stopping Criterion
    6.2.4 Parameters for the Iterative Solver
    6.2.5 Argument List for the Iterative Solver
  6.3 Matrix Operations
    6.3.1 Storage Schemes for Sparse Matrices
      6.3.1.1 SDIA: Symmetric diagonal storage scheme
      6.3.1.2 UDIA: Unsymmetric diagonal storage scheme
      6.3.1.3 GENR: General storage scheme by rows
    6.3.2 Types of Preconditioners
      6.3.2.1 DIAG: Diagonal preconditioner
      6.3.2.2 POLY: Polynomial preconditioner
      6.3.2.3 ILU: Incomplete LU preconditioner
  6.4 Iterative Solvers
  6.5 Naming Conventions
  6.6 Summary of Iterative Solver Subroutines
  6.7 Hints on the Use of the Iterative Solver
  6.8 A Look at Some Iterative Solvers

  Sparse Iterative Solver Subprograms
    DEFAULTS
    DPCG
    DPLSCG
    DPBCG
    DPCGS
    DPGMRES
    DMATVEC_SDIA
    DMATVEC_UDIA
    DMATVEC_GENR
    DCREATE_DIAG_SDIA
    DCREATE_DIAG_UDIA
    DCREATE_DIAG_GENR
    DCREATE_POLY_SDIA
    DCREATE_POLY_UDIA
    DCREATE_POLY_GENR
    DCREATE_ILU_SDIA
    DCREATE_ILU_UDIA
    DCREATE_ILU_GENR
    DAPPLY_DIAG_ALL
    DAPPLY_POLY_SDIA
    DAPPLY_POLY_UDIA
    DAPPLY_POLY_GENR
    DAPPLY_ILU_SDIA
    DAPPLY_ILU_UDIA_L
    DAPPLY_ILU_UDIA_U
    DAPPLY_ILU_GENR_L
    DAPPLY_ILU_GENR_U

7 Using LAPACK Subprograms
  7.1 Overview
  7.2 Naming Conventions
  7.3 Example of LAPACK Use and Design
  7.4 Performance Tuning

A Bibliography
  A.1 Level 1 BLAS
  A.2 Sparse Level 1 BLAS
  A.3 Level 2 BLAS
  A.4 Level 3 BLAS
  A.5 Signal Processing
  A.6 Iterative Solvers

EXAMPLES
  6-1 DPCG with User-Defined Routines and Storage
  6-2 DPCG with POLY Preconditioning for SDIA Storage
  6-3 DPLSCG with ILU Preconditioning for GENR Storage
  6-4 DPGMRES with ILU Preconditioning for UDIA Storage
  7-1 ILAENV
  7-2 XLAENV

FIGURES
  5-1 Digital Filter Transfer Function Forms
  5-2 Lowpass Nonrecursive Filter for Varying Nterms
  5-3 Lowpass Nonrecursive Filter for Varying Wiggles

TABLES
  1 Documentation Conventions
  2 DXML Font Usage
  3 DXML Symbols and Expressions
  1-1 Naming Conventions: Level 1 BLAS Subprograms
  1-2 Summary of Level 1 BLAS Subprograms
  1-3 Summary of Extensions to Level 1 BLAS Subprograms
  2-1 Naming Conventions: Sparse Level 1 BLAS Subprogram
  2-2 Summary of Level 1 Sparse BLAS Subprograms
  3-1 Naming Conventions: Level 2 BLAS Subprograms
  3-2 Summary of Level 2 BLAS Subprograms
  3-3 Values for the Argument TRANS
  3-4 Values for the Argument UPLO
  3-5 Values for the Argument DIAG
  4-1 Naming Conventions: Level 3 BLAS Subprograms
  4-2 Summary of Level 3 BLAS Subprograms
  4-3 Values for the Argument SIDE
  4-4 Values for the Arguments TRANSA and TRANSB
  4-5 Values for the Argument UPLO
  4-6 Values for the Argument DIAG
  5-1 FFT Size
  5-2 Size of Output Array for SFFT and DFFT
  5-3 Size of Output Array from CFFT and ZFFT
  5-4 Input and Output Format Argument Values
  5-5 Status Values for Unsupported Input and Output Combinations
  5-6 Naming Conventions: Fourier Transform Functions
  5-7 Summary of One-Step Fourier Transform Functions
  5-8 Summary of Three-Step Fourier Transform Functions
  5-9 Naming Conventions: Convolution and Correlation Subroutines
  5-10 Summary of Convolution Subroutines
  5-11 Summary of Correlation Subroutines
  5-12 Controlling Filtering Type
  5-13 Naming Conventions: Digital Filter Subroutines
  5-14 Summary of Digital Filter Subroutines
  6-1 Parameters for the MATVEC Subroutine
  6-2 Parameters for the PCONDR and PCONDL Subroutines
  6-3 Parameters for the MSTOP Subroutine
  6-4 Integer Parameters for the Iterative Solver
  6-5 Real Parameters for the Iterative Solver
  6-6 Default Values for Parameters
  6-7 Parameters for the SOLVER Subroutine
  6-8 Preconditioners for the Iterative Methods
  6-9 Naming Conventions: Iterative Solver Routines
  6-10 Summary of Iterative Solver Routines
  6-11 Summary of Matrix-Vector Product Routines
  6-12 Summary of Preconditioner Creation Routines
  6-13 Summary of Preconditioner Application Routines
  7-1 Naming Conventions: Mnemonics for MM
  7-2 Naming Conventions: Mnemonics for FF
  7-3 Driver Routines
  7-4 Generalized Eigenvalue Routines