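Google Cloud Life Sciences Operators
Google Cloud Life Sciences is a service that executes a series of Compute Engine containers on Google Cloud. It is used to process, analyze, and annotate genomics and biomedical data at scale.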
Warning
Cloud Life Sciences will be discontinued on July 8, 2025. Use Google Cloud Batch instead.
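Prerequisite Tasks
To use these operators, you must do a few things:
Select or create a Cloud Platform project using the Cloud Console.
Enable billing for your project, as described in the Google Cloud documentation.
Enable the API, as described in the Cloud Console documentation.
Install the API libraries via pip.
pip install 'apache-airflow[google]'
For detailed information, see Installation.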
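After installation, it can help to confirm that the Google provider package is present in the active environment. The following is a minimal sketch of one way to check, assuming the extra installs the distribution named apache-airflow-providers-google. The helper name installed_version is hypothetical and not part of Airflow.

```python
from importlib import metadata


def installed_version(dist_name: str = "apache-airflow-providers-google"):
    """Return the installed version string of dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "Google provider is not installed")
```

If the helper returns None, the pip command above was probably run in a different environment than the one Airflow uses.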
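Running a pipeline
Use the LifeSciencesRunPipelineOperator to execute pipelines.
This operator is deprecated and will be removed after July 8, 2025. All of its functionality, along with new features, is available on the Google Cloud Batch platform. Use CloudBatchSubmitJobOperator instead.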
tests/system/google/cloud/cloud_batch/example_cloud_batch.py
from google.cloud import batch_v1


def _create_job():
    # Container to run: a busybox shell that prints the task index and count.
    runnable = batch_v1.Runnable()
    runnable.container = batch_v1.Runnable.Container()
    runnable.container.image_uri = "gcr.io/google-containers/busybox"
    runnable.container.entrypoint = "/bin/sh"
    runnable.container.commands = [
        "-c",
        # Adjacent string literals are joined into a single shell command.
        "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
        "This job has a total of ${BATCH_TASK_COUNT} tasks.",
    ]

    # Each task runs the container above with its own resource request.
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # thousandths of a vCPU, so 2 vCPUs per task
    resources.memory_mib = 16
    task.compute_resource = resources
    task.max_retry_count = 2

    # A task group runs the same task specification task_count times.
    group = batch_v1.TaskGroup()
    group.task_count = 2
    group.task_spec = task

    # Machine type of the VMs that Batch provisions for the job.
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    # Assemble the job and send its logs to Cloud Logging.
    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "container"}
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING
    return job
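The same job can also be written as a plain dictionary, which is easier to inspect, template, and store than a protobuf object. The sketch below mirrors the fields set in _create_job() and uses only the standard library. It assumes that the job argument of CloudBatchSubmitJobOperator accepts a dict keyed by the protobuf field names and that the enum may be given by name; both should be checked against the provider version in use. The function name create_job_spec is hypothetical.

```python
import json


def create_job_spec(task_count: int = 2) -> dict:
    """Build a Batch job description as a JSON-serializable dict."""
    container = {
        "image_uri": "gcr.io/google-containers/busybox",
        "entrypoint": "/bin/sh",
        "commands": [
            "-c",
            "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
            "This job has a total of ${BATCH_TASK_COUNT} tasks.",
        ],
    }
    task_spec = {
        "runnables": [{"container": container}],
        # cpu_milli is in thousandths of a vCPU: 2000 means 2 vCPUs per task.
        "compute_resource": {"cpu_milli": 2000, "memory_mib": 16},
        "max_retry_count": 2,
    }
    return {
        "task_groups": [{"task_count": task_count, "task_spec": task_spec}],
        "allocation_policy": {
            "instances": [{"policy": {"machine_type": "e2-standard-4"}}]
        },
        "labels": {"env": "testing", "type": "container"},
        # Assumption: the enum value is referenced by its name.
        "logs_policy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    # Pretty-print the description to review it before submitting.
    print(json.dumps(create_job_spec(), indent=2))
```

Under the assumption above, a dict like this would be passed as the job argument of CloudBatchSubmitJobOperator, together with project_id, region, and job_name.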
Reference
For more information, see: