> ## Documentation Index
> Fetch the complete documentation index at: https://docs.vast.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Debugging

> Learn how to debug issues with Vast.ai Serverless: reading worker errors, handling increasing and decreasing load, and checking instance logs.

<script
  type="application/ld+json"
  dangerouslySetInnerHTML={{
__html: JSON.stringify({
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to Debug Vast.ai Serverless Issues",
  "description": "A guide to debugging Vast.ai Serverless issues including checking worker errors, managing increasing load, and handling decreasing load.",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Check Worker Errors and Logs",
      "text": "The Vast PyWorker framework automatically detects some errors, while others may cause instance timeout. To debug, check the instance logs via the logs button on the instance page in the GUI. For further investigation, SSH into the instance and find the model backend logs location by running 'echo $MODEL_LOG' and PyWorker logs by running 'echo ${WORKSPACE_DIR:-/workspace}/pyworker.log'."
    },
    {
      "@type": "HowToStep",
      "name": "Handle Increasing Load",
      "text": "To handle high load on instances: Set test_workers high to create more instances initially for Worker Groups with anticipated high load. Adjust cold_workers to keep enough workers around to prevent destruction during low initial load. Increase cold_mult to quickly create instances by predicting higher future load. Check max_workers to ensure it's set high enough."
    },
    {
      "@type": "HowToStep",
      "name": "Manage Decreasing Load",
      "text": "To manage decreasing load: Reduce cold_workers to stop instances quickly when load decreases to avoid unnecessary costs. The serverless system will handle this automatically, but manual adjustment can help if needed."
    }
  ]
})
}}
/>

## Worker Errors

The [Vast PyWorker](https://github.com/vast-ai/pyworker/tree/main) framework automatically detects some errors, while others may cause the instance to time out. When an error is detected, the Serverless system destroys or reboots the instance. To debug an issue manually, check the instance logs, available via the logs button on the instance page in the GUI; all PyWorker issues are logged there.
If further investigation is needed, SSH into the instance and print the location of the model backend logs by running:

```sh theme={null}
echo "$MODEL_LOG"
```

And the location of the PyWorker log:

```sh theme={null}
echo "${WORKSPACE_DIR:-/workspace}/pyworker.log"
```
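The `${WORKSPACE_DIR:-/workspace}` expansion falls back to `/workspace` when `WORKSPACE_DIR` is unset, so the command works even if the variable was never exported. Once you have both paths, you can follow the logs live; a minimal sketch (the `PYWORKER_LOG` variable name is just for the example):

```sh theme={null}
# Resolve the PyWorker log path, defaulting to /workspace when WORKSPACE_DIR is unset
PYWORKER_LOG="${WORKSPACE_DIR:-/workspace}/pyworker.log"
echo "$PYWORKER_LOG"

# Follow both logs live (Ctrl-C to stop):
#   tail -n 50 -f "$MODEL_LOG" "$PYWORKER_LOG"
```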

### Increasing Load

To handle high load on instances:

* **Set** `test_workers` **high**: Create more instances up front for Worker Groups with anticipated high load.
* **Adjust** `cold_workers`: Keep enough workers around to prevent them from being destroyed during low initial load.
* **Increase** `cold_mult`: Create instances quickly by predicting higher future load from the current high load. Adjust back down once enough instances are created.
* **Check** `max_workers`: Ensure this parameter is set high enough to create the necessary number of workers.
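How these parameters interact can be sketched with illustrative numbers (the variable names and per-worker throughput below are assumptions for the example, not real serverless settings): predicted load is current load times `cold_mult`, the group is sized to serve it, and the result is clamped between `cold_workers` and `max_workers`.

```sh theme={null}
# Illustrative sketch of the scaling arithmetic (not the real autoscaler code)
current_load=120      # observed requests/sec (example value)
cold_mult=2           # predicted load = current_load * cold_mult
per_worker=30         # requests/sec one worker can serve (assumed)
cold_workers=4        # floor: never scale below this
max_workers=20        # ceiling: never scale above this

predicted=$(( current_load * cold_mult ))
needed=$(( (predicted + per_worker - 1) / per_worker ))   # ceiling division
if [ "$needed" -lt "$cold_workers" ]; then needed=$cold_workers; fi
if [ "$needed" -gt "$max_workers" ]; then needed=$max_workers; fi
echo "target workers: $needed"
```

With these numbers the predicted load is 240 requests/sec, so the sketch targets 8 workers, comfortably inside the `cold_workers`/`max_workers` bounds; raising `max_workers` only matters once the clamp at the top is hit.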

### Decreasing Load

To manage decreasing load:

* **Reduce** `cold_workers`: Stop instances quickly when load decreases to avoid unnecessary costs. The serverless system handles this automatically, but manual adjustment can help if needed.
