Question-49: Select the correct statements regarding Cloudera Private Cloud Base and disk setup.
- Cloudera does not support more than 200 TB per data node.
- Cloudera does not support drives larger than 8 TB.
- Running CDP DC on storage platforms other than direct-attached physical disks can provide suboptimal performance.
- Cloudera Runtime and the majority of the Hadoop platform are optimized to provide high performance by distributing work across a cluster that can utilize data locality and fast local I/O.
Answer:
This Question is from QuickTechie Cloudera CDP Certification Preparation Kit.
Exp: Hard drives today come in many sizes. Popular drive sizes are 1-4 TB, although larger drives are becoming more common. When picking a drive size, consider the following points:
- Lower Cost Per TB – The larger the drive, the cheaper the cost per TB, which makes for lower TCO.
- Replication Storms – Larger drives mean that a drive failure produces a larger re-replication storm, which can take longer and saturate the network while impacting in-flight workloads.
- Cluster Performance – In general, drive size has little impact on cluster performance. The exception is when drives have different read/write speeds and a use case that leverages this gain. MapReduce is designed for long sequential reads and writes, so latency timings are generally not as important. HBase can potentially benefit from faster drives, but that is dependent on a variety of factors, such as HBase access patterns and schema design; this also implies acquisition of more nodes. Impala and Cloudera Search workloads can also potentially benefit from faster drives, but for those applications the ideal architecture is to maintain as much data in memory as possible.
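To get a feel for the replication-storm trade-off, the short Python sketch below estimates how long it might take to re-replicate the blocks of one failed drive. It is a back-of-the-envelope model, not a Cloudera formula: the function name, the 70% fill ratio, and the 2 Gbit/s of cluster bandwidth assumed to be available for recovery are all illustrative assumptions.

```python
def rereplication_hours(drive_tb, fill_ratio, recovery_gbit_per_s):
    """Rough hours needed to re-replicate the blocks of one failed drive.

    drive_tb            -- raw drive size in decimal terabytes
    fill_ratio          -- fraction of the drive holding HDFS blocks (assumed)
    recovery_gbit_per_s -- cluster bandwidth assumed available for recovery
    """
    data_bits = drive_tb * fill_ratio * 1e12 * 8      # TB -> bytes -> bits
    seconds = data_bits / (recovery_gbit_per_s * 1e9)  # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Compare the two drive sizes mentioned in the explanation.
    for size_tb in (4, 8):
        hours = rereplication_hours(size_tb, fill_ratio=0.7, recovery_gbit_per_s=2.0)
        print(f"{size_tb} TB drive: roughly {hours:.1f} h to re-replicate")
```

Under this simple model the recovery time grows linearly with drive size, which is why doubling the drive size doubles the window in which in-flight workloads compete with recovery traffic.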
Cloudera does not support more than 100 TB per data node. You could use 12 x 8 TB spindles or 24 x 4 TB spindles. Cloudera does not support drives larger than 8 TB.
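The two limits above can be checked with simple arithmetic. The following Python sketch validates a proposed drive layout against the 100 TB per-node and 8 TB per-drive limits quoted in this explanation; the function and constant names are hypothetical and are not part of any Cloudera tool.

```python
MAX_NODE_TB = 100   # per-data-node limit quoted in the explanation
MAX_DRIVE_TB = 8    # per-drive limit quoted in the explanation


def check_layout(drive_count, drive_tb):
    """Return (total_tb, problems) for a data node with identical drives."""
    total_tb = drive_count * drive_tb
    problems = []
    if drive_tb > MAX_DRIVE_TB:
        problems.append(f"drive size {drive_tb} TB exceeds {MAX_DRIVE_TB} TB")
    if total_tb > MAX_NODE_TB:
        problems.append(f"node capacity {total_tb} TB exceeds {MAX_NODE_TB} TB")
    return total_tb, problems


if __name__ == "__main__":
    # The first two layouts are the ones named in the explanation.
    for count, size in [(12, 8), (24, 4), (16, 8), (12, 10)]:
        total, problems = check_layout(count, size)
        status = "OK" if not problems else "; ".join(problems)
        print(f"{count} x {size} TB = {total} TB -> {status}")
```

Both layouts named in the explanation, 12 x 8 TB and 24 x 4 TB, total 96 TB and so stay under the per-node limit.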
Running CDP DC on storage platforms other than direct-attached physical disks can provide suboptimal performance. Cloudera Runtime and the majority of the Hadoop platform are optimized to provide high performance by distributing work across a cluster that can utilize data locality and fast local I/O.