vRealize Automation 8.x - Troubleshooting

Nov 23, 2019

With the introduction of vRealize Automation (vRA) 8.0, the traditional appliance VAMI page is gone. It has been replaced by the vRA CLI (vracli) and the Kubernetes command-line tool (kubectl). This post covers some of the more common commands you may need. To run any of the commands below, connect to the appliance over SSH and log in with the root username and password.

Check Pods / 'Services' Status

Although the traditional vRA services have been replaced by containers running in Kubernetes pods, you can still check whether they are running with the command below. Its output lists the status, the age and the number of restarts for each pod or 'service'.

kubectl -n prelude get pods
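
On a busy appliance the pod list is long, so it can help to filter it down to the pods that have restarted. The following is a minimal sketch, not part of vRA: the function name pods_with_restarts is hypothetical, and it assumes the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints the name and restart count of every pod that has restarted more
# than a given number of times.
#   $1 = restart threshold (default 0)
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
pods_with_restarts() {
  awk -v min="${1:-0}" 'NR > 1 && ($4 + 0) > (min + 0) { print $1, $4 }'
}

# Usage on the appliance:
#   kubectl -n prelude get pods | pods_with_restarts 3
```

A pod with a steadily climbing restart count is usually a better starting point for troubleshooting than one that restarted once during boot.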

Display vRA Cluster Status

vracli status

Verify the vRA Deployment Status

The output of this command is "Deployment not complete" while the appliance is still deploying or starting up; otherwise it is "Deployment complete".

vracli status deploy
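
If you would rather not re-run the check by hand, the status command can be polled until the deployment finishes. This is a minimal sketch rather than an official procedure: the function name wait_for_deploy, its arguments and the polling values are illustrative assumptions; only the "Deployment complete" string comes from the command output described above.

```shell
# Hypothetical helper: polls a status command until its output contains
# "Deployment complete".
#   $1 = seconds to sleep between checks
#   $2 = maximum number of checks
#   remaining arguments = the status command to run
# Returns 0 once the deployment is complete, 1 if the check limit is reached.
wait_for_deploy() {
  interval="$1"
  attempts="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Capture stdout and stderr; a failing status command counts as "not yet".
    out=$("$@" 2>&1) || true
    case "$out" in
      *"Deployment complete"*) return 0 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Deployment still not complete after $attempts checks" >&2
  return 1
}

# Usage on the appliance (checks every 30 seconds, for up to 30 minutes):
#   wait_for_deploy 30 60 vracli status deploy
```

Passing the status command as arguments keeps the loop independent of vracli, so it can be tried out with any command that prints the same text.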

Check Deployment Log File

The deployment log is written to /var/log/deploy.log. The command below follows the file as new entries are added.

tail -f /var/log/deploy.log

Generate a Log Bundle

The command below generates a log bundle; the output file can be found at /root/log-bundle-xxxxxxxxx.tar.xz. In my environment, the log bundle took around 20 minutes to generate and was 60MB in size; HA environments are likely to take longer and produce significantly larger bundles. The --collector-timeout flag sets a timeout for each log collection (default 1000 seconds). GSS may ask for the --include-cold-storage flag if the issue you are troubleshooting is not recent, as it adds older log files to the bundle; expect collection to be slower and the output file to be larger.

vracli log-bundle
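
Once the bundle has been generated, you will usually want its exact file name and size before copying it off the appliance. The sketch below is one way to find the newest bundle; the function name latest_log_bundle is hypothetical, and the default directory and file name pattern are taken from the output path mentioned above.

```shell
# Hypothetical helper: prints the size and path of the newest log bundle
# found in a directory.
#   $1 = directory to search (default /root)
latest_log_bundle() {
  dir="${1:-/root}"
  # ls -t sorts by modification time, newest first.
  newest=$(ls -t "$dir"/log-bundle-*.tar.xz 2>/dev/null | head -n 1) || true
  if [ -z "$newest" ]; then
    echo "No log bundle found in $dir" >&2
    return 1
  fi
  du -h "$newest"
}

# Usage on the appliance:
#   latest_log_bundle
```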

Stopping / Shutting Down vRA Cluster

These commands shut down vRealize Automation on all of the cluster nodes: they stop the services, sleep for 2 minutes, clean the current deployment and then shut down the appliance. Check the official VMware documentation for the up-to-date procedure.

/opt/scripts/svc-stop.sh
sleep 120
/opt/scripts/deploy.sh --onlyClean
shutdown -h now

Starting vRA Cluster

Power on each of the appliances and wait for them to boot completely, until the appliance console shows the blue welcome page. Ensure that all prerequisite servers, such as vRealize Identity Manager (vIDM), are also started. The first command below runs the deploy.sh script to deploy all prelude services; the kubectl command then shows the status of all the running pods or 'services'. This process can take 20+ minutes. If the appliance has insufficient memory, the deployment will time out at 30 minutes. Check the official VMware documentation for the up-to-date procedure.

/opt/scripts/deploy.sh
kubectl -n prelude get pods

vRA 8.x Error - Bad Gateway

After starting up your vRA appliances, you may find that the UI loads but shows a Bad Gateway error. This is usually because the appliance is still starting up. Provided the appliance has enough resources assigned to it, the UI will eventually load. As above, the status of the deployment can be checked with the command below. Check the READY column and confirm that all pods are ready for use: a READY value of 0/1 means the pod is not available yet. Once all pods are listed as 1/1 or 2/2, the UI will be available for use.

kubectl -n prelude get pods
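
Rather than scanning the READY column by eye, the output can be filtered down to the pods that are still coming up. This is a minimal sketch: the function name not_ready_pods is hypothetical, and it assumes the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE). Skipping pods with a STATUS of Completed is also an assumption on my part, made because finished Kubernetes job pods normally report 0/1 without being a problem.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints each pod whose READY column is not fully satisfied (e.g. 0/1, 1/2).
# Assumption: pods with STATUS "Completed" are skipped, because finished
# job pods normally show 0/1.
not_ready_pods() {
  awk 'NR > 1 && $3 != "Completed" {
    split($2, r, "/")               # r[1] = ready containers, r[2] = total
    if (r[1] != r[2]) print $1, $2, $3
  }'
}

# Usage on the appliance:
#   kubectl -n prelude get pods | not_ready_pods
```

When the filter prints nothing, every remaining pod is fully ready and the UI should be reachable.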

vRA just not working...

Sometimes, even after trying all of the above, vRA just won't come back online after a failure. If this is the case, run the command above to check the status of the pods. If they are all online except the postgres database pod, try restarting the kubelet service with the command below. Once it has run, leave the appliance alone for the next 30 minutes while vRA restarts itself and tries to come back online cleanly.

systemctl restart kubelet

Remove VM from Inventory without deleting the VM

Whilst this is 100% not supported, vautomation.dev provides a very useful article on how to remove VMs from inventory without deleting the underlying VM by accessing the internal vRA database.