Reading Data From Cloud Storage Via Cloud Functions
I am trying to do a quick proof of concept for a data processing pipeline in Python. To do this, I want to build a Google Cloud Function that is triggered when certain .csv files are dropped into Cloud Storage.
I followed this Google Functions Python tutorial. The sample code does trigger the Function and writes some simple logs when a file is dropped, but I am stuck on which call I have to make to actually read the contents of the file. I searched for SDK/API guidance but have not been able to find any.
In case this is relevant, once I have processed the .csv, I want to publish some of the data I extract from it to GCP's Pub/Sub.
The function does not actually receive the contents of the file, just some metadata about it.
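To make that concrete, here is a minimal sketch of working with the metadata alone. It assumes the event dictionary carries 'bucket' and 'name' keys, as the tutorial's sample does; the helper name describe_event and the sample values are hypothetical, chosen only for illustration.

```python
def describe_event(data):
    """Return the object's gs:// location, or None if it is not a .csv file."""
    name = data["name"]
    # The event only tells us which object changed, so filter on its name.
    if not name.lower().endswith(".csv"):
        return None
    return "gs://{}/{}".format(data["bucket"], name)


# Hypothetical event payload; real events carry additional metadata fields.
sample_event = {"bucket": "my-bucket", "name": "incoming/orders.csv"}
location = describe_event(sample_event)
```

Nothing here touches the file's contents; it only identifies which object to fetch.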
You'll want to use the google-cloud-storage client. See the "Downloading Objects" guide for more details.
Putting that together with the tutorial you're using, you get a function like:
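from google.cloud import storage

# Create the client once at import time so warm invocations can reuse it.
storage_client = storage.Client()

def hello_gcs_generic(data, context):
    # data holds event metadata only; 'bucket' and 'name' identify the object.
    bucket = storage_client.get_bucket(data['bucket'])
    blob = bucket.blob(data['name'])
    # Download the object's contents (returned as bytes).
    contents = blob.download_as_string()
    # Process the file contents, etc...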
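As for the "process the file contents" step, one possible approach is sketched below using only the standard library. It assumes the downloaded contents are UTF-8 encoded bytes with a header row; the helper names parse_csv_bytes and rows_to_payloads and the sample data are hypothetical, not part of any Google library.

```python
import csv
import io
import json


def parse_csv_bytes(contents, encoding="utf-8"):
    """Decode downloaded bytes and return one dict per CSV row."""
    text = contents.decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


def rows_to_payloads(rows):
    """Serialize each row as JSON bytes, since Pub/Sub message data must be bytes."""
    return [json.dumps(row, sort_keys=True).encode("utf-8") for row in rows]


# Hypothetical stand-in for the bytes returned by the download call.
sample = b"id,city\n1,Paris\n2,Lima\n"
rows = parse_csv_bytes(sample)
payloads = rows_to_payloads(rows)
```

Each payload could then be handed to the Pub/Sub client library's publisher. That call is left out here because the topic and project names depend on your setup.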