
Question: The Cloud Dataproc approach allows organizations to use Hadoop/Spark/Hive/Pig when needed. On average, it takes only 90 seconds from the moment resources are requested until a job can be submitted. What makes this possible?

  • The separation of storage and compute.
  • The use of queries and containers.
  • The configuration of jobs and workflows.
  • The absence of management and maintenance.

Explanation

The correct answer is the separation of storage and compute. Dataproc can start Hadoop, Spark, Hive, and Pig clusters quickly because data can remain in Cloud Storage while compute resources are created only when needed. This lets clusters be temporary rather than long-running. Keeping persistent data outside the cluster reduces setup time, since the data does not have to be copied into the cluster before a job can run, and it avoids tying storage capacity to compute capacity. The business value is faster processing with less idle infrastructure.
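The "less idle infrastructure" point can be made concrete with a little arithmetic. The sketch below is a minimal illustration, not a pricing model: it compares the cluster-hours used per day by a cluster created for each job and deleted afterwards against a cluster left running around the clock. The workload figures (4 jobs per day, 30 minutes each) and all names are hypothetical assumptions; only the 90-second startup time comes from the question.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical daily workload; the values are illustrative assumptions."""
    jobs_per_day: int
    job_minutes: float
    startup_seconds: float = 90.0  # average startup time quoted in the question


def ephemeral_cluster_hours(w: Workload) -> float:
    """Cluster-hours per day when a cluster is created per job and then deleted.

    Assumes the startup time counts toward the cluster's running time.
    """
    minutes_per_job = w.job_minutes + w.startup_seconds / 60.0
    return w.jobs_per_day * minutes_per_job / 60.0


def persistent_cluster_hours() -> float:
    """Cluster-hours per day when one cluster is left running all day."""
    return 24.0


workload = Workload(jobs_per_day=4, job_minutes=30)  # assumed numbers
ephemeral = ephemeral_cluster_hours(workload)
persistent = persistent_cluster_hours()
idle_fraction = 1 - ephemeral / persistent

print(f"Ephemeral clusters:  {ephemeral:.2f} cluster-hours per day")
print(f"Persistent cluster:  {persistent:.2f} cluster-hours per day")
print(f"Share of the persistent cluster's day spent idle: {idle_fraction:.1%}")
```

Under these assumed numbers, most of a long-running cluster's day would be spent idle. Temporary clusters only avoid that because the data lives outside the cluster and survives its deletion.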

Why the other options are incorrect

Queries and containers are unrelated execution concepts; they do not describe the Dataproc operating model.

Jobs and workflows help organize processing, but they do not explain rapid cluster availability.

The absence of management and maintenance overstates the case: Dataproc is a managed service, but configuration and operational choices still remain.

Source for verification

https://cloud.google.com/dataproc/docs/concepts/overview

https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage

