Lambda GPU Cloud

Documentation for Lambda GPU Cloud, the #1 cloud provider for high-performance GPUs

1: Are API requests rate-limited?
2: Can I access my persistent storage without an instance?
3: Can I launch an instance from the command line?
4: Can I list my running instances from a command line?
5: Can I list the offered instance types from a command line?
6: Can I pause my instance instead of terminating it?
7: Can I remotely mount persistent storage?
8: Can I set a limit (quota) on my file system usage?
9: Can I use the Cloud API to add an SSH key to my account?
10: Can my data be recovered once I've terminated my instance?
11: Can you provide an estimate of how much a job will cost?
12: Do you support Kubernetes (K8s)?
13: How are on-demand instances invoiced?
14: How are persistent storage file systems billed?
15: How do I change my billing address?
16: How do I change my password?
17: How do I get started using the dashboard?
18: How do I get started using the Demos feature?
19: How do I get started using the Firewall feature?
20: How do I get started using the Team feature?
21: How do I import an SSH key from a GitHub account?
22: How do I learn my instance's private IP address and other info?
23: How do I list my file systems using the Cloud API?
24: How do I open Jupyter Notebook on my instance?
25: How do I restart an instance using the Cloud API?
26: How do I retrieve the details of an instance from a command line?
27: How do I terminate an instance using the Cloud API?
28: How do I use persistent storage to save datasets and system state?
29: How long does it take for instances to launch?
30: Is it possible to open ports other than for SSH?
31: Is it possible to use more than one SSH key?
32: What can I do with the Cloud API?
33: What happens to my account if I don't pay an invoice?
34: What is the capacity of persistent storage file systems?
35: What network bandwidth does Lambda GPU Cloud provide?
36: What should I do about timeout waiting for RPC from GSP errors?
37: What SSH key formats are supported?
38: Why am I seeing an error about NMI received for unknown reason?
39: Why are some instance types grayed out when I try to launch an instance?
40: Why can't my program find the NVIDIA cuDNN library?
41: Why is my card being declined?

Lambda GPU Cloud provides instant access to high-performance cloud GPUs at the best prices on the market.

Note

On January 1, 2024, the prices of certain instance types will be increasing. These increases will apply to both newly launched instances and instances already running. These price increases will help us add more GPUs throughout 2024 to address availability concerns from the community.

Note

Beginning December 13, 2023, new Lambda GPU Cloud instances will launch with Ubuntu 22.04 instead of Ubuntu 20.04. Currently running instances won’t be affected by this change.

Significantly, Ubuntu 22.04 includes Python 3.10, while Ubuntu 20.04 includes Python 3.8.

See the release notes to learn more about the changes introduced in Ubuntu 22.04.

Looking to use our new API?

Read our documentation

With Lambda GPU Cloud, you have:

TensorFlow, JupyterLab, PyTorch®, and other popular ML software pre-installed
persistent storage to save your datasets and other files
root access to your instances via SSH

1 - Are API requests rate-limited?

Requests to the Cloud API are generally limited to 1 request per second.

Requests to the /instance-operations/launch endpoint are limited to 1 request every 10 seconds.

Note

If you’re being rate limited, you’ll receive an HTTP 429 response status code in response to your request.

Note

The request rate limits may change at any time.
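A client can stay within these limits by backing off whenever it receives an HTTP 429 response. The following is a minimal Python sketch, not an official client. The function name `call_with_backoff`, its parameters, and the delay values are illustrative assumptions; `send` stands for any callable that performs one request and returns the HTTP status code and response body.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than 429.

    The delay doubles after each rate-limited attempt (1 s, 2 s, 4 s, ...).
    These values are assumptions based on the documented limit of
    1 request per second; the Cloud API doesn't prescribe them.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Stop on any non-429 status, or when attempts are exhausted.
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(delay)  # wait before retrying the rate-limited request
        delay *= 2
    raise ValueError("max_attempts must be at least 1")
```

For the /instance-operations/launch endpoint, a `base_delay` of 10.0 would match the documented limit of 1 request every 10 seconds.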

2 - Can I access my persistent storage without an instance?

You can’t access your persistent storage file systems without attaching them to an instance at the time the instance is launched.

For this reason, it’s recommended that you keep a local copy of the files you have saved in your persistent storage file systems.

Note

File systems can’t be attached to running instances.

Moreover, file systems can only be attached to instances in the same region. For example, a file system created in the us-west-1 (California, USA) region can only be attached to instances in the us-west-1 region.

File systems can’t be transferred from one region to another. However, you can copy data between file systems using tools such as rsync.

Lambda GPU Cloud currently doesn’t offer block or object storage.

Note

You’re billed for persistent storage usage whether or not your file systems are attached to an instance.

3 - Can I launch an instance from the command line?

You can launch an instance from the command line using the Cloud API:

Generate an API key.

Create a file named request.json that contains the necessary payload. For example:

{
  "region_name": "us-east-1",
  "instance_type_name": "gpu_1x_a100_sxm4",
  "ssh_key_names": [
    "SSH-KEY"
  ],
  "file_system_names": [],
  "quantity": 1
}

Run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instance-operations/launch -d @request.json -H "Content-Type: application/json" | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).
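The same request can also be built from Python using only the standard library. This is a sketch that mirrors the curl command above rather than an official client; the helper name `build_launch_request` is hypothetical, and the payload values are the placeholders from request.json.

```python
import base64
import json
import urllib.request

API_BASE = "https://cloud.lambdalabs.com/api/v1"


def build_launch_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but don't send) a POST request equivalent to the curl command."""
    # `curl -u API-KEY:` sends HTTP Basic auth with the API key as the
    # username and an empty password, which is why the colon is required.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        f"{API_BASE}/instance-operations/launch",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "region_name": "us-east-1",
    "instance_type_name": "gpu_1x_a100_sxm4",
    "ssh_key_names": ["SSH-KEY"],
    "file_system_names": [],
    "quantity": 1,
}
request = build_launch_request("API-KEY", payload)

# Sending the request launches (and starts billing for) an instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```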

4 - Can I list my running instances from a command line?

You can list your running instances from a command line using the Cloud API.

First, generate an API key. Then, run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instances | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

5 - Can I list the offered instance types from a command line?

You can list the instance types offered by Lambda GPU Cloud by first generating an API key, then running the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instance-types | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

6 - Can I pause my instance instead of terminating it?

It currently isn’t possible to pause (suspend) your instance rather than terminating it. However, this feature is in the works.

Until this feature is implemented, you can use persistent storage to imitate some of the benefits of being able to pause your instance.

7 - Can I remotely mount persistent storage?

Lambda GPU Cloud currently doesn’t support remote mounting of persistent storage.

Persistent storage file systems can’t be accessed without being attached to an instance.

8 - Can I set a limit (quota) on my file system usage?

Currently, you can’t set a limit (quota) on your persistent storage file system usage.

You can see the usage of a persistent storage file system from within an instance by running df -h -BG. This command will produce output similar to:

Filesystem           1G-blocks  Used   Available Use% Mounted on
udev                       99G    0G         99G   0% /dev
tmpfs                      20G    1G         20G   1% /run
/dev/vda1                1357G   23G       1335G   2% /
tmpfs                      99G    0G         99G   0% /dev/shm
tmpfs                       1G    0G          1G   0% /run/lock
tmpfs                      99G    0G         99G   0% /sys/fs/cgroup
persistent-storage 8589934592G    0G 8589934592G   0% /home/ubuntu/persistent-storage
/dev/vda15                  1G    1G          1G   6% /boot/efi
/dev/loop0                  1G    1G          0G 100% /snap/core20/1822
/dev/loop1                  1G    1G          0G 100% /snap/lxd/24061
/dev/loop2                  1G    1G          0G 100% /snap/snapd/18357
tmpfs                      20G    0G         20G   0% /run/user/1000

In the example output above:

The name of the file system is persistent-storage.
The size of the file system is 8589934592G (8 exabytes).
The available capacity of the file system is 8589934592G.
The used percentage of the file system is 0%.
The file system is mounted on /home/ubuntu/persistent-storage.

Tip

You can also use the Cloud API’s /file-systems endpoint to find out your file system usage.
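Because quotas aren’t available, one possible workaround is to check usage yourself from within an instance. The sketch below uses Python’s `shutil.disk_usage` to read the same figures that df reports; the function names and the idea of a self-imposed "soft limit" are assumptions for illustration, and nothing here is enforced by Lambda GPU Cloud.

```python
import shutil

GIB = 1024 ** 3  # df -BG reports sizes in units of 1 GiB


def used_gib(path: str) -> float:
    """Return the space used on the file system containing path, in GiB."""
    return shutil.disk_usage(path).used / GIB


def over_soft_limit(path: str, limit_gib: float) -> bool:
    """Return True if usage exceeds a self-imposed limit (not enforced)."""
    return used_gib(path) > limit_gib
```

For example, `over_soft_limit("/home/ubuntu/persistent-storage", 500)` would report whether more than 500 GiB of the file system mounted at that path is in use.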

9 - Can I use the Cloud API to add an SSH key to my account?

You can use the Cloud API to:

Add an existing SSH key to your account.
Generate a new SSH key pair.
List the SSH keys saved in your account.
Delete an SSH key from your account.

Note

Following these instructions won’t add the SSH key to existing instances.

To add SSH keys to existing instances, read our FAQ on using more than one SSH key.

Note

You can add up to 1,024 SSH keys to your account.

Add an existing SSH key to your account

To add an existing SSH key to your account:

Generate an API key if you don’t have one already.

Create a file named ssh-key.json that contains the necessary payload. For example:

{
  "name": "my-new-key",
  "public_key": "ssh-ed25519 KEY COMMENT"
}

Run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/ssh-keys -d @ssh-key.json -H "Content-Type: application/json" | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Generate a new SSH key pair

To generate a new SSH key pair:

Generate an API key if you don’t have one already.

Run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/ssh-keys -d '{ "name": "my-generated-key" }' -H "Content-Type: application/json" | jq -r '.data.private_key' > my-generated-private-key.pem

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

The private key for your SSH key pair will be saved as my-generated-private-key.pem.

Run chmod 400 my-generated-private-key.pem to set the correct file permissions for your private key.
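If you call the API from Python instead of curl, the two steps of saving the key and restricting its permissions can be combined. This is a sketch under the assumption that the response body is the same JSON the curl command receives, with the key at `.data.private_key` as in the jq filter above; the function name `save_private_key` is hypothetical.

```python
import json
import os


def save_private_key(response_text: str, path: str) -> None:
    """Write .data.private_key from the API response to path, mode 0400."""
    private_key = json.loads(response_text)["data"]["private_key"]
    # O_EXCL refuses to overwrite an existing key file. Mode 0o400 makes the
    # new file readable only by its owner, like `chmod 400`.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as key_file:
        key_file.write(private_key)
    os.chmod(path, 0o400)  # enforce the mode regardless of the umask
```

Creating the file with restrictive permissions from the start avoids a window in which the private key is readable by other users.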

List the SSH keys saved in your account

To list the SSH keys saved in your account, generate an API key if you don’t already have one. Then, run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/ssh-keys | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Delete an SSH key from your account

To delete an SSH key from your account, generate an API key if you don’t already have one. Then, run the following command:

curl -u API-KEY: -X DELETE https://cloud.lambdalabs.com/api/v1/ssh-keys/SSH-KEY-ID

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Replace SSH-KEY-ID with the ID of the SSH key you want to delete.

Note

Use the API to obtain the IDs of the SSH keys saved in your account.

10 - Can my data be recovered once I've terminated my instance?

Warning

We cannot recover your data once you’ve terminated your instance! Before terminating an instance, make sure to back up all data that you want to keep.

If you want to save data even after you terminate your instance, create a persistent filesystem.

Note

The persistent filesystem must be attached when you launch your instance. It can’t be attached to an instance that’s already running.

When you create a persistent filesystem, a directory with the name of your persistent filesystem is created in your home directory. For example, if the name of your persistent filesystem is PERSISTENT-FILESYSTEM, the directory is created at /home/ubuntu/PERSISTENT-FILESYSTEM. Data not stored in this directory is erased once you terminate your instance and cannot be recovered.

11 - Can you provide an estimate of how much a job will cost?

We can’t estimate how much your job will cost or how long it’ll take to complete on one of our instances. This is because we don’t know the details of your job, such as how your program works.

However, the performance of our instances is close to what you’d expect from bare metal machines with the same GPUs.

In order to estimate how much your job will cost or how long it’ll take to complete, we suggest you create an instance and benchmark your program.

Tip

Check out our GPU benchmarks to form a general idea of the performance provided by our instances. Keep in mind that real-world performance doesn’t always match the performance provided by benchmarks.

For help benchmarking or optimizing your ML jobs, contact our Machine Learning team.

12 - Do you support Kubernetes (K8s)?

We currently don’t support Kubernetes, also known as K8s.

13 - How are on-demand instances invoiced?

On-demand instances are billed in one-minute increments. Billing starts once an instance is launched and the dashboard shows the instance’s status as Running.

Tip

The Cloud API’s /instances endpoint will also show the instance’s status as active.

Billing stops when an instance is terminated from the dashboard or using the Cloud API’s /instance-operations/terminate endpoint.

Note

You’re not billed for the time an instance’s status in the dashboard is Booting. Similarly, you’re not billed for the time an instance’s status in the dashboard is Terminating.

Warning

Be sure to terminate any instances that you’re not using!

You will be billed for all minutes that an instance is running, even if the instance isn’t actively being used.

The GPU Cloud dashboard allows you to view your resource usage.

Invoices are sent weekly for the previous week’s usage.

Note

On-demand instances require us to maintain excess capacity at all times so we can meet the changing workloads of our customers. For this reason, on-demand instances are priced higher than reserved instances.

Conversely, we offer reserved GPU Cloud instances at a significant savings over on-demand instances, since they allow us to more accurately determine our capacity needs ahead of time.

14 - How are persistent storage file systems billed?

Persistent storage is billed per GB used per month, in increments of 1 hour.

For example, based on the price of $0.20 per GB used per month:

If you use 1,000 GB of your file system capacity for an entire month (30 days, or 720 hours), you’ll be billed $200.00.
If you use 1,000 GB of your file system capacity for a single day (24 hours), you’ll be billed $6.67.
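The arithmetic behind these examples can be sketched in Python. The function below assumes a 720-hour (30-day) month, as in the examples, and takes usage in whole hours; how partial hours are rounded isn’t stated here, so the sketch doesn’t model it. The function name and the default price are illustrative.

```python
def storage_cost(gb_used: float, hours: float,
                 price_per_gb_month: float = 0.20,
                 hours_per_month: int = 720) -> float:
    """Estimate a persistent storage charge, rounded to the nearest cent."""
    monthly_cost = gb_used * price_per_gb_month
    # Prorate the monthly cost by the fraction of the month actually used.
    return round(monthly_cost * hours / hours_per_month, 2)


print(storage_cost(1000, 720))  # 1,000 GB for a full 30-day month
print(storage_cost(1000, 24))   # 1,000 GB for a single day
```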

Note

The actual price of persistent storage will be displayed when you create your file system.

15 - How do I change my billing address?

To change your billing address: in the Cloud dashboard, at the bottom of the left sidebar, click Settings. Click the Billing tab, then click Edit billing address.

16 - How do I change my password?

To reset your Lambda Cloud password, visit the Reset Password page.

17 - How do I get started using the dashboard?

The dashboard makes it easy to get started using Lambda GPU Cloud.

From the dashboard, you can:

Launch, restart, and terminate instances
Create and manage persistent storage file systems
Add, generate, and delete SSH keys
Generate and delete API keys
Use the Demos feature
View usage
Manage a Team
Modify account settings

Launch, restart, and terminate instances

Launch instances

To launch an instance:

Click Instances in the left sidebar of the dashboard.

Then, click Launch instance at the top-right of the dashboard.
Click the instance type that you want to launch.
Click the region in which you want to launch the instance.
Click the persistent storage file system that you want to attach to your instance.

If you don’t want to or can’t attach a persistent storage file system to your instance, click Don’t attach a filesystem.
Select the SSH key that you want to use for your instance. Then, click Launch instance.

Tip
You can add additional SSH keys to your instance once your instance has launched.
Review the license agreements and terms of service. If you agree to them, click I agree to the above to launch your instance.

In the dashboard, you should now see your instance listed. Once your instance has finished booting, you’ll be provided with the details needed to begin using your instance.

Tip

You can also launch instances using the Cloud API.

You can also use the Cloud API to get details of a running instance.

Restart instances

Restart instances by clicking the checkboxes next to the instances you want to restart. Then, click Restart at the top-right of the dashboard.

Terminate instances

Terminate instances by clicking the checkboxes next to the instances you want to terminate. Then, click Terminate at the top-right of the dashboard.

When prompted to do so, type in erase data on instance, then click Terminate instances.

Tip

You can also terminate instances using the Cloud API.

Create and manage persistent storage file systems

Create a persistent storage file system

To create a persistent storage file system:

Click Storage in the left sidebar of the dashboard.

Then, click Create filesystem at the top-right of the dashboard.
Enter a name and select a region for your file system. Then click Create filesystem.

You should now see your persistent storage file system listed in the dashboard.

Add, generate, and delete SSH keys

Add or generate an SSH key

To add an SSH key that you already have:

Click SSH keys in the left sidebar of the dashboard.

Then, click Add SSH key at the top-right of the dashboard.
In the text input box, paste your public SSH key. Enter a name for your key, then click Add SSH key.

To generate a new SSH key:

Instead of pasting your public SSH key as instructed above, click Generate a new SSH key. Type in a name for your key, then click Create.

The private key for your new SSH key will automatically download.

Tip

You can also use the Cloud API to add and generate SSH keys.

Delete SSH keys

Delete SSH keys by clicking Delete at the far-right of the SSH key you want to delete.

Generate and delete API keys

Generate API keys

Generate API keys by clicking API keys in the left sidebar of the dashboard.

Then, click Generate API Key at the top-right of the dashboard.

Delete API keys

Delete API keys by clicking Delete at the far-right of the API key you want to delete.

Use the Demos feature

Use the Demos feature by clicking Demos in the left sidebar of the dashboard.

View usage

View usage information by clicking Usage in the left sidebar of the dashboard.

Manage a Team

Click Team at the bottom of the left sidebar to access the Team feature.

Learn how to manage a Team by reading our FAQ on getting started with the Team feature.

Modify account settings

Click Settings at the bottom of the left sidebar to modify your account settings, including your password, payment method, and billing address.

18 - How do I get started using the Demos feature?

The Demos feature allows you to easily share your Gradio machine learning app (demo) both publicly and privately.

To get started using the Demos feature, you need to:

Add a demo to your Lambda GPU Cloud account.
Host your demo on a new instance.

Note

It currently isn’t possible to host a demo on an existing instance.

Note

The new instance hosting your demo can be used like any other Lambda GPU Cloud on-demand instance. For example, you can SSH into the instance and open Jupyter Notebook on the instance.

As with other Lambda GPU Cloud on-demand instances, you’re billed for all of the time the instance for your demo is running.

Note

Demos can be hosted on multi-GPU instance types. However, a demo uses only one of the GPUs.

Also, demos currently can’t be hosted on H100 instances.

Add a demo to your Lambda GPU Cloud account

In the left sidebar of the dashboard, click Demos. Then, click the Add demo button at the top-right of the dashboard.

The Add a demo dialog will appear.
Under Demo Source URL, enter the URL of the Git repository containing your demo’s source code.
Note

The Demos feature looks in your Git repository for a file named README.md. If the file doesn’t exist, or if the file doesn’t contain the required properties, you’ll receive a Demo misconfigured error.

The README.md must have at the top a YAML block containing the following:
```
---
sdk: gradio
sdk_version: GRADIO-VERSION
app_file: PATH-TO-APP-FILE
---
```
Replace GRADIO-VERSION with the version of Gradio your demo is built with, for example, 3.24.1.

Replace PATH-TO-APP-FILE with the path to your Gradio application file (the file containing the Gradio interface code), relative to the root of your Git repository. For example, if your Gradio application file is named app.py and is located in the root directory of your Git repository, replace PATH-TO-APP-FILE with app.py.

Properties other than sdk, sdk_version, and app_file are ignored by the Demos feature.
Tip

If you don’t yet have your own demo, you can try the Demos feature using the demos created by Lambda’s Machine Learning team. Demos created by Lambda’s Machine Learning team include:
Under Visibility, choose:
- Public if you want to list your demo in the library of public models shared by the Lambda community.
- Unlisted if you want your demo accessible only by those who know your demo’s URL.
Under Name, give your demo a name. If you choose to make your demo public, the name of your demo will appear in the Lambda library of public models. The name of your demo will also appear in your demo’s URL.
(Optional) Under Description, enter a description for your demo.

The description shows under the name of your demo in your library of demos. If your demo is public, the description also shows under the name of your demo in the Lambda library of public models.

Note
You can’t change the name or description of your demo once you add it. However, you can delete your demo then add it again.
Click Add demo, then follow the prompts to launch a new instance to host your demo.

Tip
To host a demo that’s already added to your account, in the Demos dashboard, find the row containing the demo you want to host, then click Host.

Your new instance will take several minutes to launch and for your demo to become accessible.

Note

The link to your demo might temporarily appear in the Instances dashboard, then disappear. This is expected behavior and doesn’t mean your instance or demo is broken.

The models used by demos are often several gigabytes in size, and can take 5 to 15 minutes to download and load.
Once your instance is launched and your demo is accessible, a link with your demo’s name will appear under the Demo column. Click the link to access your demo.

Tip

To see a gallery of all of your demos, at the top-right of the Demos dashboard, click the See your demos button.
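To catch a Demo misconfigured error before adding a demo, the README.md requirements described above can be checked locally. The following Python sketch is a simplified check, not the validation the Demos feature itself performs: it recognizes only flat `key: value` lines rather than full YAML, and the function name `check_readme` is hypothetical.

```python
REQUIRED_KEYS = ("sdk", "sdk_version", "app_file")


def check_readme(text: str) -> list:
    """Return a list of problems found in a README.md's leading YAML block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["README.md doesn't start with a '---' YAML block"]
    try:
        # Index of the line that closes the YAML block.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return ["YAML block isn't closed with '---'"]
    # Simplified parsing: only flat `key: value` pairs are recognized.
    values = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    problems = [f"missing property: {key}" for key in REQUIRED_KEYS
                if not values.get(key)]
    if values.get("sdk") and values["sdk"] != "gradio":
        problems.append(f"sdk must be gradio, not {values['sdk']}")
    return problems
```

An empty list means the three required properties are present and `sdk` is set to `gradio`.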

Troubleshooting demos

If you experience trouble accessing your demo, the Demos logs can be helpful for troubleshooting.

To view the Demos log files, SSH into your instance or open a terminal in Jupyter Notebook, then run:

sudo bash -c 'for f in /root/virt-sysprep-firstboot.log ~demo/bootstrap.log; do printf "### BEGIN $f\n\n"; cat $f; printf "\n### END $f\n\n"; done > demos_debug_logs.txt; printf "### BEGIN journalctl -u lambda-demos.service\n\n$(journalctl -u lambda-demos.service)\n\n### END journalctl -u lambda-demos.service" >> demos_debug_logs.txt'

This command will produce a file named demos_debug_logs.txt containing the logs for the Demos feature. You can review the logs from within your instance by running less demos_debug_logs.txt. Alternatively, you can download the file locally to review or share.

Note

The Lambda Support team provides only basic support for the Demos feature. However, assistance might be available in the community forum.

If you’re experiencing problems using the Demos feature, running the above command and providing the demos_debug_logs.txt file to the Support team can help with future improvements to the Demos feature.

Here are some examples of how problems present in logs:

Misconfigured README.md file

### BEGIN /home/demo/bootstrap.log

Cloning into '/home/demo/source'...
Traceback (most recent call last):
  File "<stdin>", line 17, in <module>
  File "<stdin>", line 15, in load
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 3 validation errors for Metadata
sdk
  field required (type=value_error.missing)
sdk_version
  field required (type=value_error.missing)
app_file
  field required (type=value_error.missing)
Created symlink /etc/systemd/system/multi-user.target.wants/lambda-demos-error-server.service → /etc/systemd/system/lambda-demos-error-server.service.
Bootstrap failed: misconfigured

### END /home/demo/bootstrap.log

Not a Gradio app

### BEGIN /home/demo/bootstrap.log

Cloning into '/home/demo/source'...
Traceback (most recent call last):
  File "<stdin>", line 17, in <module>
  File "<stdin>", line 15, in load
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for Metadata
sdk
  unexpected value; permitted: 'gradio' (type=value_error.const; given=docker; permitted=('gradio',))
sdk_version
  field required (type=value_error.missing)
Created symlink /etc/systemd/system/multi-user.target.wants/lambda-demos-error-server.service → /etc/systemd/system/lambda-demos-error-server.service.
Bootstrap failed: misconfigured

### END /home/demo/bootstrap.log
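When a log is long, the fields named in the validation error can be pulled out programmatically. The sketch below assumes the log follows the layout shown in the examples above (a "validation errors for" line, then each field name followed by an indented message); the function name `summarize_bootstrap_log` is hypothetical.

```python
import re


def summarize_bootstrap_log(log_text: str) -> dict:
    """Map each field named in the validation error to its message."""
    errors = {}
    in_errors = False
    field = None
    for line in log_text.splitlines():
        if re.search(r"\d+ validation errors? for \w+", line):
            in_errors = True  # field/message pairs start on the next line
            continue
        if not in_errors:
            continue
        if line.startswith((" ", "\t")) and field:
            errors[field] = line.strip()  # indented line: message for field
            field = None
        elif re.fullmatch(r"\w+", line):
            field = line  # unindented single word: a field name
        else:
            in_errors = False  # any other line ends the error listing
    return errors
```

For the first example log above, this would return the keys `sdk`, `sdk_version`, and `app_file`, each mapped to its "field required" message.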

19 - How do I get started using the Firewall feature?

The Firewall feature allows you to configure firewall rules to restrict incoming traffic to your instances.

Note

Firewall rules configured using the Firewall feature apply to all of your instances outside of the Texas, USA (us-south-1) region.

To use the Firewall feature:

Click Firewall in the left sidebar of the dashboard to open your firewall settings.

Under General Settings, use the toggle next to Allow ICMP traffic (ping) to allow or restrict incoming ICMP traffic to your instances.

Note
For network diagnostic tools such as ping and mtr to be able to reach your instances, you need to allow incoming ICMP traffic.
Next to Inbound Rules, click Edit to configure incoming TCP and UDP traffic rules.

In the drop-down menu under Type, select:
- Custom TCP to manually configure a rule to allow incoming TCP traffic.
- Custom UDP to manually configure a rule to allow incoming UDP traffic.
- HTTPS to automatically configure a rule to allow incoming HTTPS traffic.
- SSH to automatically configure a rule to allow incoming SSH traffic.
- All TCP to automatically configure a rule to allow all incoming TCP traffic.
- All UDP to automatically configure a rule to allow all incoming UDP traffic.
Warning
If you don’t have a rule to allow incoming traffic to port TCP/22, you won’t be able to access your instances using SSH.

In the Source field, either:
- Click the 🔎 to automatically enter your current IP address.
- Enter a single IP address, for example, 203.0.113.1.
- Enter an IP address range in CIDR notation, for example, 203.0.113.0/24.
To allow incoming traffic from any source, enter 0.0.0.0/0.

If you choose Custom TCP or Custom UDP, enter a Port range.

Port range can be:
- A single port, for example, 8080.
- A range of ports, for example, 8080-8081.
(Optional) Enter a Description for the rule.
(Optional) Click Add rule to add additional rules.
(Optional) Click the x next to any rule you want to delete.
Click Update to apply your changes.

Note

The maximum number of firewall rules you can have is 20.

If you have more than 20 rules, new instances you create might not launch. Also, it’s possible that not all of your rules will be active, which might leave your instances insecure.
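Before entering values in the dashboard, you can sanity-check Source and Port range values locally. The following Python sketch uses the standard `ipaddress` module; the function names are hypothetical, and the strictness choices (rejecting CIDR ranges with host bits set, allowing ports 1 through 65535) are assumptions of this sketch that may differ from what the dashboard accepts.

```python
import ipaddress


def parse_source(source: str) -> ipaddress.IPv4Network:
    """Parse a Source value: a single IPv4 address or a CIDR range."""
    # A single address such as 203.0.113.1 becomes a /32 network.
    # strict=True rejects ranges with host bits set, such as 203.0.113.1/24.
    return ipaddress.IPv4Network(source, strict=True)


def parse_port_range(port_range: str) -> tuple:
    """Parse a Port range value such as "8080" or "8080-8081"."""
    first, _, last = port_range.partition("-")
    low, high = int(first), int(last or first)
    if not (1 <= low <= high <= 65535):
        raise ValueError(f"invalid port range: {port_range}")
    return low, high
```

Both functions raise `ValueError` for input they can’t parse, for example `parse_port_range("8081-8080")`.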

20 - How do I get started using the Team feature?

Create a team

In the dashboard, click Team at the bottom-left of the dashboard. Then, click Invite at the top-right of the Team dashboard.
Enter the email address of the person you want to invite to your team. Select their role in the team, either an Admin or a Member. Then, click Send invitation.
Warning

Be sure to invite only trusted persons to your team!

Currently, the only differences between the Admin and Member roles are that an Admin can:
- Invite others to the team.
- Remove others from the team.
- Modify payment information.
- Change the team name.
This means that a person with a Member role can, for example:
- Launch instances that will incur charges.
- Terminate instances that should continue to run.
Note

You can’t send an invitation to an email address already associated with a Lambda Cloud account. If you try to, you’ll be presented with a message that says there is already a Lambda Cloud account associated with the email address you’re trying to send an invitation to.

The person you’re inviting to your team must first close their existing Lambda Cloud account before they can be invited to your team.
The person you invited to your team will receive an email letting them know that they’ve been invited to a team on Lambda Cloud.

In that email, they should click Join the Team.

Note

Until the person you invited to your team accepts their invitation, they will be listed in the Team dashboard as Invitation pending.

You can delete the invitation while it’s pending by clicking ⋮ where the person is listed in your Team dashboard, then choosing Delete invitation.

Note
If the person you invited to your team doesn’t receive their invitation, you have to delete their invitation then invite them again.

In the Team dashboard of the person you invited to your team, the person will see that they are on your team. In your Team dashboard, you’ll see the person you invited listed.

Change a teammate’s role

To change the role of a person on your team from Member to Admin, click ⋮ where the person is listed in your Team dashboard, then choose Change to Admin.

Conversely, to change the role of a person on your team from Admin to Member, click ⋮ where the person is listed in your Team dashboard, then choose Change to Member.

Close a teammate’s account

To close a teammate’s account, click the ⋮ where your teammate is listed in your Team dashboard. Then, choose Deactivate user.

Warning

Carefully review the information in the dialog box that pops up.

Change team name

To change the name of your team, click Settings at the bottom-left of the dashboard, then click Edit team name. Enter a new name for your team, then click Update team name.

21 - How do I import an SSH key from a GitHub account?

To import an SSH key from a GitHub account and add it to your instance:

Using your existing SSH key, SSH into your instance.

Alternatively, open a terminal in Jupyter Notebook.
Import the SSH key from the GitHub account by running:
```
ssh-import-id gh:USERNAME
```
Replace USERNAME with the GitHub account’s username.

If the SSH key is successfully imported, ssh-import-id will output a message similar to:

2023-08-04 15:03:52,622 INFO Authorized key ['256', 'SHA256:C6pl0q4evVYZWcyByVF69D6fdbdKa7F8ei8V2F/bTW0', 'cbrownstein-lambda@github/67649580', '(ED25519)']
2023-08-04 15:03:52,623 INFO [1] SSH keys [Authorized]

If the SSH key isn’t successfully imported, ssh-import-id will output a message similar to:

2023-08-04 15:06:36,425 ERROR Username "fake-cbrownstein-lambda" not found at GitHub API. status_code=404 user=fake-cbrownstein-lambda

22 - How do I learn my instance's private IP address and other info?

You can learn your instance’s private IP address with the ip command.

You can learn what ports are open on your instance with the nmap command.

Learn your instance’s private IP address

To learn your instance’s private IP address, SSH into your instance and run:

ip -4 -br addr show | grep ' 10\.'

The above command will output, for example:

enp5s0           UP             10.19.60.24/20

In the above example, the instance’s private IP address is 10.19.60.24.

Tip

If you want your instance’s private IP address and only that address, run the following command instead:

ip -4 -br addr show | grep -Eo '10\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

The above command will output, for example:

10.19.60.24
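If you need the address inside a script, parsing the output as IP addresses is less error-prone than matching text, because an address such as 100.64.0.5 or an interface name containing "10" can’t be mistaken for a 10.x.x.x address. The sketch below is one way to do this in Python; the function name is hypothetical, and it assumes the brief output format shown above (interface name, state, then addresses).

```python
import ipaddress

PRIVATE_RANGE = ipaddress.ip_network("10.0.0.0/8")


def private_addresses(ip_output: str) -> list:
    """Return the 10.x.x.x addresses found in `ip -4 -br addr show` output."""
    found = []
    for line in ip_output.splitlines():
        # Brief format: interface name, state, then one or more addr/prefix.
        for field in line.split()[2:]:
            try:
                address = ipaddress.ip_interface(field).ip
            except ValueError:
                continue  # not an address; skip it
            if address in PRIVATE_RANGE:
                found.append(str(address))
    return found
```

On an instance, the input could come from `subprocess.run(["ip", "-4", "-br", "addr", "show"], capture_output=True, text=True).stdout`.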

Learn what ports on your instance are publicly accessible

You can use Nmap to learn what ports on your instance are publicly accessible, that is, reachable over the Internet.

Note

The instructions below assume you’re running Ubuntu on your computer.

First, install Nmap on your computer (not on your instance) by running:

sudo apt install -y nmap

Next, run:

nmap -Pn INSTANCE-IP-ADDRESS

Replace INSTANCE-IP-ADDRESS with your instance’s IP address, which you can get from the Cloud dashboard.

The command will output, for example:

Starting Nmap 7.80 ( https://nmap.org ) at 2023-01-11 13:22 PST
Nmap scan report for 129.159.46.35
Host is up (0.041s latency).
Not shown: 999 filtered ports
PORT   STATE SERVICE
22/tcp open  ssh

Nmap done: 1 IP address (1 host up) scanned in 6.42 seconds

In the above example, TCP port 22 (SSH) is publicly accessible.
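If you only want the list of open ports, for example to use in a script, you can filter the Nmap output. The helper below is a sketch; the name list_open_ports is hypothetical, and it assumes the PORT, STATE, SERVICE column layout shown in the example output above.

```shell
# Hypothetical helper: print the PORT column of every line whose STATE is "open".
list_open_ports() {
  awk '$2 == "open" { print $1 }'
}

# On your computer, you would run:
#   nmap -Pn INSTANCE-IP-ADDRESS | list_open_ports
# Here the helper is fed two lines from the example output instead:
printf '%s\n' 'PORT   STATE SERVICE' '22/tcp open  ssh' | list_open_ports
```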

Note

If nmap doesn’t show TCP/22 (SSH) or any other ports as open, one of the following might be the cause:

Your instance might be terminated. Check the GPU Instances dashboard to find out.
Your firewall rules might be blocking incoming connections to your instance.

Note

nmap -Pn INSTANCE-IP-ADDRESS only scans the 1,000 most common TCP ports.
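To look beyond the default set of ports, you can pass Nmap's -p option, which accepts a list or range of ports; -p- scans all TCP ports and takes noticeably longer. The wrapper below is a sketch, and the function name scan_tcp_ports is hypothetical.

```shell
# Hypothetical wrapper around nmap.
#   $1: the instance's IP address
#   $2: optional port specification such as "22,8888" or "1-1024";
#       defaults to "-", which nmap interprets as all TCP ports.
scan_tcp_ports() {
  host=$1
  ports=${2:--}
  nmap -Pn "-p$ports" "$host"
}

# Usage (run on your computer, not on your instance):
#   scan_tcp_ports INSTANCE-IP-ADDRESS            # all TCP ports
#   scan_tcp_ports INSTANCE-IP-ADDRESS 22,8888    # only the listed ports
```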

23 - How do I list my file systems using the Cloud API?

To list your persistent storage file systems using the Cloud API:

Generate an API key if you don’t already have one.
Run the following command:
```
curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/file-systems | jq .
```
Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).
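If you call the Cloud API often, you might prefer to keep the API key in an environment variable rather than typing it into each command. The following is a minimal sketch: the variable name LAMBDA_API_KEY and the function name lambda_api_get are hypothetical, not part of the Cloud API.

```shell
# Hypothetical helper: send a GET request to a Cloud API path, reading the
# API key from the LAMBDA_API_KEY environment variable.
lambda_api_get() {
  if [ -z "${LAMBDA_API_KEY:-}" ]; then
    echo "LAMBDA_API_KEY is not set" >&2
    return 1
  fi
  # The trailing colon after the key is required, as in the command above.
  curl -sS -u "${LAMBDA_API_KEY}:" "https://cloud.lambdalabs.com/api/v1/$1"
}

# Usage:
#   export LAMBDA_API_KEY=API-KEY
#   lambda_api_get file-systems | jq .
```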

24 - How do I open Jupyter Notebook on my instance?

To open Jupyter Notebook on your instance:

In the GPU instances dashboard, find the row for your instance.
Click Launch in the Cloud IDE column.

Tip

Watch Lambda’s GPU Cloud Tutorial with Jupyter Notebook video on YouTube to learn more about using Jupyter Notebook on Lambda GPU Cloud instances.

25 - How do I restart an instance using the Cloud API?

You can restart instances from the command line using the Cloud API:

Generate an API key if you haven’t already generated one.
Create a file that contains the necessary payload. For example:
```
{
  "instance_ids": [
    "0920582c7ff041399e34823a0be62549"
  ]
}
```
Note
Use the API to obtain the IDs of your instances.
Run the following command:
```
curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instance-operations/restart -d @INSTANCE-IDS -H "Content-Type: application/json" | jq .
```
Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Replace INSTANCE-IDS with the name of the payload file you created in the previous step.
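Instead of writing the payload file by hand, you could generate it from a list of instance IDs. This is a sketch only; the function name make_payload is hypothetical, and it assumes instance IDs contain no characters that need JSON escaping, as in the example ID above.

```shell
# Hypothetical helper: print a JSON payload of the form
# {"instance_ids": ["ID-1", "ID-2"]} for the instance IDs given as arguments.
make_payload() {
  ids=""
  for id in "$@"; do
    # Separate entries with ", " once the list is non-empty.
    ids="${ids:+$ids, }\"$id\""
  done
  printf '{"instance_ids": [%s]}\n' "$ids"
}

# Prints the payload for the example ID. Redirect the output to a file
# (for example, > INSTANCE-IDS) to use it with the curl command above.
make_payload 0920582c7ff041399e34823a0be62549
```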

26 - How do I retrieve the details of an instance from a command line?

You can retrieve the details of an instance from a command line using the Cloud API.

First, generate an API key. Then, run the following command:

curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instances/INSTANCE-ID | jq .

Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Replace INSTANCE-ID with the ID of the instance you want details about.

Note

Use the API to obtain the IDs of your instances.

27 - How do I terminate an instance using the Cloud API?

You can terminate instances from the command line using the Cloud API:

Generate an API key if you haven’t already generated one.
Create a file that contains the necessary payload. For example:
```
{
  "instance_ids": [
    "0920582c7ff041399e34823a0be62549"
  ]
}
```
Note
Use the API to obtain the IDs of your instances.
Run the following command:
```
curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instance-operations/terminate -d @INSTANCE-IDS -H "Content-Type: application/json" | jq .
```
Replace API-KEY with your actual API key. Don’t remove the trailing colon (:).

Replace INSTANCE-IDS with the name of the payload file you created in the previous step.
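Because a terminate request applies to every instance ID in the payload file, you might want a confirmation step in scripts that call it. The guard below is a sketch, and the function name confirm_termination is hypothetical.

```shell
# Hypothetical guard: ask for confirmation and succeed only if the answer
# is exactly "yes". $1 is the name of the payload file, used in the prompt.
confirm_termination() {
  printf 'Terminate the instances listed in %s? Type "yes" to continue: ' "$1" >&2
  IFS= read -r answer || answer=""
  [ "$answer" = "yes" ]
}

# Usage:
#   confirm_termination INSTANCE-IDS && \
#     curl -u API-KEY: https://cloud.lambdalabs.com/api/v1/instance-operations/terminate \
#       -d @INSTANCE-IDS -H "Content-Type: application/json" | jq .
```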

28 - How do I use persistent storage to save datasets and system state?

You can use the Lambda Cloud Storage feature to save:

Large datasets that you don’t want to re-upload every time you start an instance
The state of your system, including software packages and configurations

Note

You can have up to 24 persistent storage file systems.

Preserving the state of your system

To save the state of your system, including:

Packages installed system-wide using apt-get
Python packages installed using pip
conda environments

we recommend creating containers using Docker or other container software.

You can also create a script that runs the commands needed to re-create your system state. For example:

sudo apt install PACKAGE_0 PACKAGE_1 PACKAGE_2 && \
pip install PACKAGE_3 PACKAGE_4 PACKAGE_5

Run the script each time you start an instance.

If you only need to preserve Python packages and not packages installed system-wide, you can create a Python virtual environment.

You can also create a conda environment.

Tip

For the highest performance when training, we recommend copying your dataset, containers, and virtual environments from persistent storage to your home directory. This can take some time but greatly increases the speed of training.
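The copy described in the tip above can be sketched as a small helper. The function name stage_to_home and the paths in the usage comment are hypothetical; substitute the location of your own file system and dataset.

```shell
# Hypothetical helper: copy the contents of a directory on persistent storage
# ($1) into a directory under your home directory ($2), preserving
# permissions and timestamps.
stage_to_home() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  # The trailing "/." copies the directory's contents, including hidden files.
  cp -a "$src"/. "$dest"/
}

# Usage (paths are placeholders):
#   stage_to_home PATH-TO-FILE-SYSTEM/datasets/my-dataset ~/datasets/my-dataset
```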

29 - How long does it take for instances to launch?

Single-GPU instances usually take 3-5 minutes to launch.

Multi-GPU instances usually take 10-15 minutes to launch.

Note

Jupyter Notebook and Demos can take a few minutes after an instance launches to become accessible.

Note

Billing starts the moment an instance begins booting.

30 - Is it possible to open ports other than for SSH?

By default, all ports are open to TCP and UDP traffic. ICMP traffic is also allowed by default.

Tip

You can use the Firewall feature to restrict incoming connections to your instances.

31 - Is it possible to use more than one SSH key?

It’s possible to allow more than one SSH key to access your instance. To do so, you need to add public keys to ~/.ssh/authorized_keys. You can do this with the echo command.

Tip

You can also import SSH keys from GitHub.

Note

This FAQ assumes that you’ve already generated another SSH key pair, that is, a private key and a public key.

Public keys look like this:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIK5HIO+OQSyFjz0clkvg+48YAihYMo5J7AGKiq+9Alg8 user@hostname

SSH into your instance as you normally do and run:

echo 'PUBLIC-KEY' >> ~/.ssh/authorized_keys

Replace PUBLIC-KEY with the public key you want to add to your instance. Make sure to keep the single quotes (' ').

You should now be able to log into your instance using the SSH key you just added.

Tip

You can make sure the public key has been added by running:

cat ~/.ssh/authorized_keys

The last line of output should be the public key you just added.
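Running the echo command twice adds the same key twice. If you script this step, a sketch like the following appends a key only when it isn't already present. The function name add_authorized_key is hypothetical.

```shell
# Hypothetical helper: append a public key ($1) to an authorized_keys file
# ($2, defaulting to ~/.ssh/authorized_keys) unless an identical line exists.
add_authorized_key() {
  key=$1
  file=${2:-"$HOME/.ssh/authorized_keys"}
  mkdir -p "$(dirname "$file")"
  touch "$file"
  # -x matches whole lines only; -F treats the key as a fixed string.
  grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
}

# Usage:
#   add_authorized_key 'PUBLIC-KEY'
```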

32 - What can I do with the Cloud API?

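With the Cloud API, you can, among other things:

Add an SSH key to your account
List your instances and retrieve the details of an instance
List your persistent storage file systems
Restart instances
Terminate instances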

33 - What happens to my account if I don't pay an invoice?

Warning

If an invoice remains unpaid after we’ve made 4 attempts to charge the card on file, we may suspend your account.

If your account is suspended, your running instances may be terminated and your files may be deleted without prior notice.

Eventually, all of your instances will be terminated and all of your persistent storage file systems will be deleted.

Your account will be permanently banned from Lambda GPU Cloud. Your account will be referred for collection. Legal action may be taken against you.

34 - What is the capacity of persistent storage file systems?

Each persistent storage file system has a capacity of 8 exabytes, or 8,000,000 terabytes, except for file systems created in the Texas, USA (us-south-1) region. The capacity of file systems in the Texas, USA (us-south-1) region is 10 terabytes.

You can have a total of 24 file systems.

35 - What network bandwidth does Lambda GPU Cloud provide?

Note

Some sites limit transfer speeds. This is known as bandwidth throttling.

Lambda GPU Cloud doesn’t limit your transfer speeds but can’t control other sites’ use of bandwidth throttling.

Further, real-world network bandwidth depends on a variety of factors, including the total number of connections opened by your applications and overall network utilization.

Utah, USA region (us-west-3)

The bandwidth between instances in our Utah, USA region (us-west-3) can be up to 200 Gbps.

The total bandwidth from this region to the Internet can be up to 20 Gbps.

Texas, USA region (us-south-1)

The bandwidth between instances in our Texas, USA region (us-south-1) can be up to 200 Gbps.

The total bandwidth from this region to the Internet can be up to 20 Gbps.

Note

We’re in the process of testing the network bandwidth in our other regions.

36 - What should I do about timeout waiting for RPC from GSP errors?

If your instance’s logs contain error messages about Timeout waiting for RPC from GSP!, the system software installed on your instance needs to be upgraded.

Note

nvidia-smi might also produce output similar to the following:

+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM...  On   | 00000000:8C:00.0 Off |                    0 |
| N/A   34C    P0    60W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   5  ERR!                On   | 00000000:91:00.0 Off |                 ERR! |
|ERR!  ERR! ERR!    ERR! / ERR! |      0MiB / 81920MiB |    ERR!      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+

To upgrade the system software installed on your instance, run:

sudo apt update && sudo apt full-upgrade && sudo reboot

Warning

The above command will reboot your instance.
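Before upgrading, you might want to confirm that the message actually appears in your logs. The check below is a sketch: the function name has_gsp_timeout is hypothetical, and it assumes the message appears in the kernel log, which this FAQ doesn't state.

```shell
# Hypothetical helper: succeed if the log text read from stdin contains
# the GSP timeout message.
has_gsp_timeout() {
  grep -qF 'Timeout waiting for RPC from GSP'
}

# Usage on your instance:
#   sudo dmesg | has_gsp_timeout && echo "GSP timeout found; upgrade recommended"
```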

37 - What SSH key formats are supported?

You can add SSH keys in the following formats using the dashboard or the Cloud API:

OpenSSH (the format ssh-keygen uses by default when generating keys)
RFC4716 (the format PuTTYgen uses when you save a public key)
PKCS8
PEM

Note

OpenSSH keys look like:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIK5HIO+OQSyFjz0clkvg+48YAihYMo5J7AGKiq+9Alg8 foo@bar

RFC4716 keys begin with:
```
---- BEGIN SSH2 PUBLIC KEY ----
```
PKCS8 keys begin with:
```
-----BEGIN PUBLIC KEY-----
```
PEM keys begin with, for example:
```
-----BEGIN RSA PUBLIC KEY-----
```
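To check which of these formats a key file uses, you can inspect its first line. The following is a sketch based only on the prefixes listed above; the function name ssh_key_format is hypothetical, and the OpenSSH branch assumes the key type starts with ssh-, ecdsa-, or sk-.

```shell
# Hypothetical helper: read a public key on stdin and print a guess at its
# format based on the first line.
ssh_key_format() {
  first_line=$(head -n 1)
  case $first_line in
    '---- BEGIN SSH2 PUBLIC KEY ----'*) echo RFC4716 ;;
    '-----BEGIN PUBLIC KEY-----'*) echo PKCS8 ;;
    '-----BEGIN '*' PUBLIC KEY-----'*) echo PEM ;;
    ssh-*|ecdsa-*|sk-*) echo OpenSSH ;;
    *) echo unknown ;;
  esac
}

# Usage:
#   ssh_key_format < PUBLIC-KEY-FILE
```

If you need to convert between formats, ssh-keygen can export a key with its -e and -m options, for example ssh-keygen -e -m RFC4716 -f PUBLIC-KEY-FILE.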

38 - Why am I seeing an error about NMI received for unknown reason?

You can safely disregard the error message: “Uhhuh. NMI received for unknown reason […].”

This error message might show up in, for example:

The log file /var/log/syslog.
The output of the command dmesg.
The output of the command journalctl.

The error message results from a bug in AMD’s newer processors, including processors used in our servers. The bug has no impact other than causing the “NMI received for unknown reason” error message to appear in system logs.

Tip

To learn more about the “NMI received for unknown reason” error message, see:

The Linux Kernel Mailing List (LKML) discussion about the bug.
The patch that suppresses the error message in newer versions of the Linux kernel.

39 - Why are some instance types grayed out when I try to launch an instance?

If you try to launch an instance from the dashboard and see that the instance type you want is grayed out, then we’re currently at capacity for that instance type.

40 - Why can't my program find the NVIDIA cuDNN library?

Unfortunately, the NVIDIA cuDNN license limits how cuDNN can be used on our instances.

On our instances, cuDNN can only be used by the PyTorch® framework and TensorFlow library installed as part of Lambda Stack.

Other software, including PyTorch and TensorFlow installed outside of Lambda Stack, won’t be able to find and use the cuDNN library installed on our instances.

Tip

Software outside of Lambda Stack usually looks for the cuDNN library files in /usr/lib/x86_64-linux-gnu. However, on our instances, the cuDNN library files are in /usr/lib/python3/dist-packages/tensorflow.

Creating symbolic links, or “symlinks,” for the cuDNN library files might allow your program to find the cuDNN library on our instances.

Run the following command to create symlinks for the cuDNN library files:

for cudnn_so in /usr/lib/python3/dist-packages/tensorflow/libcudnn*; do
  sudo ln -s "$cudnn_so" /usr/lib/x86_64-linux-gnu/
done

41 - Why is my card being declined?

Common reasons why card transactions are declined include:

The card is a debit card or a prepaid card

We don’t accept debit cards or prepaid cards. We only accept major credit cards.

The purchase is being made from a country we don’t support

We currently only support customers in the following regions:

United States
Canada
Chile
Iceland
United Arab Emirates
Saudi Arabia
South Africa
Israel
Taiwan
South Korea
Japan
Singapore
Australia
New Zealand
United Kingdom
Switzerland
European Union (except for Romania)

The purchase is being made while you’re connected to a VPN

Purchases made while using a VPN are flagged as suspicious.

The card issuer is denying our pre-authorization charge

We make a $10 pre-authorization charge to a card before accepting it for payment, similar to how gas stations and hotels do. If the card issuer denies the pre-authorization charge, then we can’t accept the card for payment.

The wrong CVV or ZIP Code is being entered

Card purchases won’t go through if the CVV (security code) is entered incorrectly. Also, card purchases will be denied if the ZIP Code doesn’t match the card’s billing address.

If none of these are applicable to you, contact the Lambda Support team for help.