CONTENTS

Title Page
Copyright Page
Preface

1 Introduction to VAXcluster Systems
  1.1 Shared Resources
    1.1.1 Disk Storage
    1.1.2 Batch and Print Job Processing
  1.2 Interconnect Devices
  1.3 Software Components
  1.4 Configuration Types
    1.4.1 CI-Based VAXcluster Systems
    1.4.2 Local Area VAXcluster Systems
      1.4.2.1 High-Availability Configurations with Multiple System Disks
      1.4.2.2 High-Availability Dual-Host Configurations
    1.4.3 Mixed-Interconnect VAXcluster Systems
    1.4.4 Security for Local Area and Mixed-Interconnect VAXcluster Systems
  1.5 Connection Management
    1.5.1 The Quorum Scheme
    1.5.2 Quorum Disk
    1.5.3 State Transitions
  1.6 Configuration Planning

2 Preparing the Cluster Operating Environment
  2.1 Directory Structure on a Common System Disk
  2.2 Installing the VMS Operating System in the VAXcluster Environment
  2.3 Configuring and Starting the DECnet-VAX Network
    2.3.1 Copying Remote Node Databases
    2.3.2 Enabling VAXcluster Alias Operations
  2.4 Coordinating Startup Command Procedures
    2.4.1 Building Startup Procedures for a Common-Environment Cluster
      2.4.1.1 Procedures for Existing Computers
      2.4.1.2 Procedures for Newly Installed Computers
    2.4.2 Building Startup Procedures for a Multiple-Environment Cluster
  2.5 Coordinating System Files for a Common-Environment Cluster
    2.5.1 Coordinating User Accounts
    2.5.2 Preparing the Rights Database
    2.5.3 Preparing the MAIL Database
    2.5.4 Coordinating Shared System Files in Clusters with Multiple Common System Disks

3 Setting Up and Managing Cluster Disks
  3.1 Cluster-Accessible Disks
    3.1.1 HSC Disks
    3.1.2 MSCP-Served Disks
      3.1.2.1 MSCP Server Functions
      3.1.2.2 MSCP Load Sharing
    3.1.3 Dual-Pathed Disks
      3.1.3.1 Dual-Ported HSC Disks
      3.1.3.2 Dual-Pathed DSA Disks on Local UDA/KDA/KDB Controllers
      3.1.3.3 DSSI-Connected ISAs
      3.1.3.4 Dual-Ported MASSBUS Disks
  3.2 Cluster Device-Naming Conventions
    3.2.1 Rules for Specifying Allocation Class Values
    3.2.2 Sample Configurations with Named Devices
  3.3 Shared Disks
  3.4 Configuring Cluster Disks
  3.5 Rebuilding Cluster Disks

4 Setting Up and Managing Cluster Queues
  4.1 Clusterwide Queues
  4.2 Cluster Printer Queues
    4.2.1 Setting Up Printer Queues
    4.2.2 Setting Up Clusterwide Generic Printer Queues
  4.3 Cluster Batch Queues
    4.3.1 Setting Up Executor Batch Queues
    4.3.2 Setting Up Clusterwide Generic Batch Queues
  4.4 Using a Common Command Procedure to Set Up Cluster Queues

5 Building and Maintaining the Cluster
  5.1 CLUSTER_CONFIG.COM Functions
  5.2 Determining Locations and Sizes for Satellite Page and Swap Files
  5.3 Selecting Boot and Disk Servers
  5.4 Determining Allocation Class Values in Mixed-Interconnect Clusters
  5.5 Configuring the Cluster
    5.5.1 Adding a Computer to the Cluster
      5.5.1.1 Updating Network Data After Adding a Satellite
      5.5.1.2 Restoring a Satellite's Network Data
      5.5.1.3 Controlling Clusterwide Broadcast Messages on Satellites and Boot Servers
    5.5.2 Removing a Computer from the Cluster
    5.5.3 Changing a Computer's Characteristics
    5.5.4 Changing the Cluster Configuration Type
      5.5.4.1 Changing an Existing CI-Only Cluster to a Mixed-Interconnect Configuration
      5.5.4.2 Changing an Existing Local Area Cluster to a Mixed-Interconnect Configuration
    5.5.5 Converting a Standalone Computer to a VAXcluster Computer
    5.5.6 Creating a Duplicate System Disk
  5.6 Reconfiguring the Cluster After a Major Change
    5.6.1 Updating MODPARAMS.DAT Files to Adjust Cluster Quorum
    5.6.2 Shutting Down the Cluster
    5.6.3 Changing Allocation Class Values on HSC Subsystems
    5.6.4 Rebooting the Cluster
  5.7 Maintaining the Cluster
    5.7.1 Running AUTOGEN with the FEEDBACK Option
    5.7.2 Recording Configuration Data
    5.7.3 Monitoring Ethernet Activity in Local Area and Mixed-Interconnect Clusters
    5.7.4 Restoring Cluster Quorum After an Unexpected Computer Failure
    5.7.5 Selecting Cluster Shutdown Options
      5.7.5.1 The REMOVE_NODE Option
      5.7.5.2 The CLUSTER_SHUTDOWN Option
      5.7.5.3 The REBOOT_CHECK Option
      5.7.5.4 The SAVE_FEEDBACK Option
    5.7.6 Rebooting a Satellite with an Operating System on a Local Disk
    5.7.7 Performing Security Functions in Local Area and Mixed-Interconnect Clusters
      5.7.7.1 Maintaining Cluster Security Data
      5.7.7.2 Controlling Conversational Bootstrap Operations for Satellites
  5.8 Guidelines for Configuring Large Clusters
    5.8.1 Configuring Disk Server Ethernet Adapters and Memory
    5.8.2 Configuring System Disks
      5.8.2.1 Concurrent User Activity
      5.8.2.2 Concurrent Booting Activity
      5.8.2.3 Boot Time Costs
      5.8.2.4 Moving High-Activity Files off System Disks
      5.8.2.5 Controlling Dump File Size and Creation
      5.8.2.6 Sharing Dump Files
    5.8.3 Adding Computers to an Existing Cluster
      5.8.3.1 Running AUTOGEN with FEEDBACK for Initial Configuration
      5.8.3.2 Creating a Command File to Run AUTOGEN with FEEDBACK
    5.8.4 Setting Up a New Large VAXcluster System
    5.8.5 Defining the VAXcluster Alias

A Cluster SYSGEN Parameters

B Building a Common SYSUAF.DAT File

C Cluster Troubleshooting Information
  C.1 Diagnosing Failures of Computers to Boot or to Join the Cluster
    C.1.1 Summary of Events for Computers Booting and Joining the Cluster
    C.1.2 CI-Connected Computer Fails to Boot
    C.1.3 Satellite Fails to Boot
    C.1.4 Computer Fails to Join the Cluster
    C.1.5 Startup Procedures Fail to Complete
  C.2 Diagnosing Cluster Hangs
    C.2.1 Cluster Quorum Is Lost
    C.2.2 A Shared Cluster Resource Is Inaccessible
  C.3 Diagnosing CLUEXIT Bugchecks
  C.4 Diagnosing VAXport Device Problems
    C.4.1 VAXport Communication Mechanisms
    C.4.2 Port Failures
      C.4.2.1 Verifying CI Port Functions
      C.4.2.2 Verifying CI Cable Connections
      C.4.2.3 Repairing CI Cables
    C.4.3 Analyzing Error Log Entries for VAXport Devices
      C.4.3.1 Error Log Entry Formats
      C.4.3.2 Device-Attention Entries
      C.4.3.3 Logged-Message Entries
      C.4.3.4 Error Log Entry Descriptions
    C.4.4 OPA0 Error Messages

EXAMPLES

2-1 Sample Interactive Network Configuration Session
4-1 Common Procedure to Set Up VAXcluster Queues
5-1 Sample Interactive CLUSTER_CONFIG.COM Session to Add a CI-Connected Computer as a Boot Server
5-2 Sample Interactive CLUSTER_CONFIG.COM Session to Add a Satellite with Local Page and Swap Files
5-3 Sample NETNODE_UPDATE.COM File
5-4 Sample Interactive CLUSTER_CONFIG.COM Session to Remove a Satellite with Local Page and Swap Files
5-5 Sample Interactive CLUSTER_CONFIG.COM Session to Enable the Local Computer as a Disk Server
5-6 Sample Interactive CLUSTER_CONFIG.COM Session to Change the Local Computer's ALLOCLASS Value
5-7 Sample Interactive CLUSTER_CONFIG.COM Session to Enable the Local Computer as a Boot Server
5-8 Sample Interactive CLUSTER_CONFIG.COM Session to Change a Satellite's Hardware Address
5-9 Sample Interactive CLUSTER_CONFIG.COM Session to Convert a Standalone Computer to a Cluster Boot Server
5-10 Sample Interactive CLUSTER_CONFIG.COM CREATE Session
5-11 Sample SYSMAN Session to Change the Cluster Password
C-1 CI Device-Attention Entry
C-2 Ethernet Device-Attention Entry
C-3 CI Logged-Message Entry

FIGURES

1-1 Typical CI-Based VAXcluster Configuration
1-2 Local Area VAXcluster System with Single Boot Server
1-3 High-Availability Local Area VAXcluster Configuration
1-4 Dual-Host VAXcluster Configuration
1-5 Typical Mixed-Interconnect VAXcluster System
2-1 Directory Structure on Common System Disk
2-2 File Search Order on Common System Disk
3-1 CI-Based Configuration with Shared Disks
3-2 Mixed-Interconnect VAXcluster Segment with Dual-Pathed HSC Disk
3-3 Mixed-Interconnect VAXcluster Segment with Dual-Pathed DSA Disk
3-4 Device Names in a Mixed-Interconnect Cluster
4-1 Sample Printer Configuration
4-2 Printer Queue Configuration
4-3 Clusterwide Generic Printer Queue Configuration
4-4 Sample Batch Queue Configuration
4-5 Clusterwide Generic Batch Queue Configuration
C-1 A Correctly Connected Two-Computer CI Cluster
C-2 Crossed CI Cable Pair

TABLES

2-1 Information Requested for CI-Based Configurations
2-2 Information Requested for Local Area and Mixed-Interconnect Configurations
3-1 Specifying Values for MSCP_LOAD and MSCP_SERVE_ALL Parameters
5-1 Summary of CLUSTER_CONFIG.COM Functions
5-2 Data Requested by CLUSTER_CONFIG.COM
5-3 CLUSTER_CONFIG.COM CHANGE Options
5-4 Summary of SYSMAN CONFIGURATION Commands for Cluster Authorization
5-5 System Disk I/O Activity and Boot Time for Single Satellite
5-6 System Disk I/O Activity and Boot Times for Multiple Satellites
5-7 AUTOGEN Dump File Symbols
A-1 Cluster SYSGEN Parameters