Why does PyTorch nn.Module.cuda() move only parameters and buffers to the GPU, and not Module member tensors?
Problem description
nn.Module.cuda() moves all model parameters and buffers to the GPU.
But why not the model's member tensor?
class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        self.expected_moved_cuda_tensor = torch.tensor([0, 2, 3])

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)

toy_module = ToyModule()
toy_module.cuda()
next(toy_module.layer.parameters()).device
>>> device(type='cuda', index=0)
For the model member tensor, however, the device stays unchanged.
>>> toy_module.expected_moved_cuda_tensor.device
device(type='cpu')
Answer

If you define a tensor inside the module, it needs to be registered as either a parameter or a buffer so that the module is aware of it.
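To see why registration matters, note that cuda() (like cpu() and to()) is built on the module's internal _apply() traversal, which visits only child modules, registered parameters, and registered buffers; an ordinary attribute assigned with self.some_tensor = ... is never visited. Below is a minimal sketch of that idea, not PyTorch's actual implementation: the helper name toy_apply is invented here, and the real Module._apply does additional bookkeeping.

import torch

def toy_apply(module: torch.nn.Module, fn) -> None:
    # Simplified picture of what Module._apply traverses.
    for child in module.children():
        toy_apply(child, fn)  # recurse into submodules
    for param in module._parameters.values():
        if param is not None:
            param.data = fn(param.data)  # move each registered parameter
    for name, buf in module._buffers.items():
        if buf is not None:
            module._buffers[name] = fn(buf)  # replace each registered buffer
    # Plain attributes such as self.expected_moved_cuda_tensor are in
    # neither _parameters nor _buffers, so they are never touched.

toy_apply(toy_module, lambda t: t.cuda())  # moves parameters and buffers only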
Parameters are tensors that are to be trained, and they will be returned by model.parameters(). They are easy to register: all you need to do is wrap the tensor in the nn.Parameter type, and it will be registered automatically. Note that only floating-point tensors can be parameters.
class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        # registering expected_moved_cuda_tensor as a trainable parameter
        self.expected_moved_cuda_tensor = torch.nn.Parameter(torch.tensor([0., 2., 3.]))

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)
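As a quick sanity check of the floating-point restriction mentioned above, wrapping an integer tensor in nn.Parameter raises an error, because only floating-point (and complex) tensors can require gradients; the exact wording of the message varies across PyTorch versions:

>>> torch.nn.Parameter(torch.tensor([0, 2, 3]))
RuntimeError: Only Tensors of floating point and complex dtype can require gradients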
Buffers are tensors that will be registered in the module, so methods like .cuda() will affect them, but they will not be returned by model.parameters(). Buffers are not restricted to a particular data type.
class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        # registering expected_moved_cuda_tensor as a buffer
        # Note: this creates a new member variable named expected_moved_cuda_tensor
        self.register_buffer('expected_moved_cuda_tensor', torch.tensor([0, 2, 3]))

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)
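One more property of buffers worth knowing: by default they are included in the module's state_dict alongside the parameters, so they are saved and restored with the model, yet they still do not show up in model.parameters(). (Recent PyTorch versions also accept register_buffer(..., persistent=False) for buffers that should move with .cuda() but not be serialized.) A quick illustration; key ordering may differ by version:

>>> toy_module = ToyModule()
>>> list(toy_module.state_dict().keys())
['expected_moved_cuda_tensor', 'layer.weight', 'layer.bias']
>>> [name for name, _ in toy_module.named_parameters()]
['layer.weight', 'layer.bias']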
In both of the above cases, the following code behaves the same:
>>> toy_module = ToyModule()
>>> toy_module.cuda()
>>> next(toy_module.layer.parameters()).device
device(type='cuda', index=0)
>>> toy_module.expected_moved_cuda_tensor.device
device(type='cuda', index=0)
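The same rule applies to the other movement methods such as .to(), .cpu(), and .half(): they also act only on registered parameters and buffers. For example, the device-agnostic idiom below behaves identically (output shown assuming a CUDA device is available):

>>> device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
>>> toy_module = ToyModule().to(device)
>>> toy_module.expected_moved_cuda_tensor.device
device(type='cuda', index=0)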