Showing content from http://cloud.google.com/compute/docs/troubleshooting/troubleshooting-cpu-soft-lockup below:

Troubleshooting vCPU soft lockups | Compute Engine Documentation

Troubleshooting vCPU soft lockups

Linux

This document describes how to troubleshoot vCPU soft lockups. A soft lockup occurs when a virtual machine (VM) instance's vCPU is unable to run a new task for more than 20 seconds. Most soft lockups are caused by bugs in application software.

Soft lockups can cause VMs to become unresponsive for short periods of time, disrupt SSH access to VMs, and trigger application timeouts or failover. VMs that are experiencing a soft lockup might also have unusually high or unusually low CPU utilization, depending on the exact cause of the soft lockup.

Identify soft lockups

To identify whether your VM is experiencing a soft lockup, review the serial port output or the operating system logs for a soft lockup stack trace.

Example soft lockup stack trace

watchdog: BUG: soft lockup - CPU#3 stuck for 22s!
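
When you have a saved copy of the serial port output or an OS log, a short script can pull out the affected CPU and the stuck duration. The following is a minimal sketch, assuming the log lines follow the format of the example above; the function name `find_soft_lockups` and the sample text are hypothetical, and other kernel versions may format the message differently.

```python
import re

# Pattern modeled on the example line above (an assumption about the format).
SOFT_LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")


def find_soft_lockups(log_text):
    """Return a (cpu, seconds) tuple for each soft lockup message in log_text."""
    return [(int(cpu), int(secs)) for cpu, secs in SOFT_LOCKUP_RE.findall(log_text)]


# Made-up sample log text for illustration only.
sample = (
    "kernel: an unrelated message\n"
    "watchdog: BUG: soft lockup - CPU#3 stuck for 22s!\n"
)

print(find_soft_lockups(sample))  # [(3, 22)]
```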

To detect future soft lockups, you can do the following:

  1. Enable serial port output logging.

  2. Create a log-based alerting policy for the following log:

    resource.type="gce_instance" log_id("serialconsole.googleapis.com/serial_port_1_output") textPayload=~"watchdog.*lockup"
    
    Note: When you test the query, it is likely that no logs appear. This is expected behavior.
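
Because testing the query usually returns no logs, it can help to check offline which lines the `textPayload=~"watchdog.*lockup"` clause would select. The following is a rough sketch that uses Python's `re` module as a stand-in for the regular expression engine used by the logging query; for a pattern this simple the two should behave alike, but that is an assumption. The helper name `would_match` and the sample lines are hypothetical.

```python
import re

# Same pattern as in the textPayload clause of the query above.
FILTER_PATTERN = re.compile(r"watchdog.*lockup")


def would_match(line):
    """Return True if the pattern occurs anywhere in the line."""
    return FILTER_PATTERN.search(line) is not None


# Made-up sample lines for illustration only.
lines = [
    "watchdog: BUG: soft lockup - CPU#3 stuck for 22s!",
    "kernel: an unrelated message",
]

for line in lines:
    print(would_match(line), line)
```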

Troubleshoot soft lockups

After you've identified that a soft lockup is occurring, try the following troubleshooting steps to resolve the issue:

  1. Check your OS vendor's site for known errors with your OS version. The stack trace sometimes references specific kernel modules, which can point to the function or operation that is involved.
  2. Identify whether the soft lockup repeats with any frequency, such as coinciding with high load or certain activities. If the soft lockups correlate with high load, you might need to reconfigure your workload, for example by using a larger VM or splitting the load across more VMs.
  3. Check if the soft lockups correlate with any changes to your runtime environment such as new software deployments or OS image updates.
  4. Evaluate whether any maintenance events took place around the time of the soft lockup by reviewing the system event audit logs.
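
The correlation checks in steps 2 through 4 can be sketched as a simple time-window comparison. The following is a minimal illustration, assuming you have already collected the lockup timestamps and the timestamps of candidate events (deployments, image updates, or maintenance events) yourself; the function name `events_near`, the 10-minute window, and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def events_near(lockup_times, event_times, window=timedelta(minutes=10)):
    """Map each lockup time to the events within +/- window of it."""
    return {
        lockup: [e for e in event_times if abs(e - lockup) <= window]
        for lockup in lockup_times
    }


# Hypothetical timestamps for illustration only.
lockups = [datetime(2025, 1, 1, 3, 0), datetime(2025, 1, 2, 14, 30)]
events = [datetime(2025, 1, 1, 2, 55)]

for lockup, nearby in events_near(lockups, events).items():
    print(lockup.isoformat(), "->", [e.isoformat() for e in nearby])
```

A lockup with no nearby events is less likely to be explained by a deployment or maintenance event, which makes load patterns or a known OS issue more plausible candidates.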

If the preceding troubleshooting steps didn't resolve the issue, file a support case and include all of the information you gathered from troubleshooting.

Best practices to avoid soft lockups

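To help prevent your VMs from experiencing soft lockups, we recommend the following best practices: use redundant components, use compute-optimized machine families for intense workloads, test your workloads with simulated maintenance events, and keep your OS up to date.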

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-07 UTC.
