Extract a Google Drive zip from a Google Colab notebook
Problem description
I already have a zipped dataset (2K images) on Google Drive, and I have to use it in an ML training algorithm. The code below retrieves the file's content in string format:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import io
import zipfile
# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# Download a file based on its file ID.
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz
file_id = '1T80o3Jh3tHPO7hI5FBxcX-jFnxEuUE9K' #-- Updated File ID for my zip
downloaded = drive.CreateFile({'id': file_id})
#print('Downloaded content "{}"'.format(downloaded.GetContentString(encoding='cp862')))
But I have to extract the archive and store its contents in a separate directory, since that would make the dataset easier to process (and to understand).
I tried to extract it further, but I get a "Not a zipfile" error:
dataset = io.BytesIO(downloaded.encode('cp862'))
zip_ref = zipfile.ZipFile(dataset, "r")
zip_ref.extractall()
zip_ref.close()
Google Drive dataset
Note: The dataset is just for reference. I have already downloaded this zip to my Google Drive, and I am referring only to the file in my Drive.
Recommended answer
You can simply use this:
!unzip file_location
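If you would rather stay in Python than use the shell command, the same extraction can be sketched with the standard-library zipfile module. This is only a minimal sketch, not part of the original answer: the helper name extract_zip and the file names dataset.zip and dataset are hypothetical. It assumes the archive has already been saved to the Colab file system as raw bytes, for example with PyDrive's GetContentFile rather than GetContentString.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted member names.

    Hypothetical helper for illustration; not part of PyDrive or Colab.
    """
    zip_path = Path(zip_path)
    if not zipfile.is_zipfile(zip_path):
        # Fail early with a clear message instead of a BadZipFile error
        # partway through extraction.
        raise ValueError(f"{zip_path} is not a valid zip archive")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


# Assumed usage in Colab, with `downloaded` created as in the question:
# downloaded.GetContentFile("dataset.zip")  # save the raw bytes to local disk
# extract_zip("dataset.zip", "dataset")     # images end up under ./dataset
```

Writing the file to disk first keeps the binary data intact, and extracting into a named directory keeps the images separate from the rest of the notebook's working directory.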