01. When MaxCompute stores data, it uses the _________ file format to reduce unnecessary disk read operations.
a) Column-store
b) Row-store
c) Key-value pair storage
d) Document storage
02. Jack is analyzing natural language with MaxCompute. He has collected 1000 news articles from portal websites, with each article saved as a record.
Now he wants to extract terms from these records (i.e., split the contents of each record into multiple independent terms and save them in a separate table), with each term saved as an individual record. Jack is familiar with user-defined functions (UDFs), and he wants to write a UDF to extract the terms for him.
From a functional perspective, which type of UDF is most suitable for this scenario?
a) User Defined Scalar Function
b) User Defined Table Valued Function
c) User Defined Aggregation Function
d) User Defined Group Function
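The transformation Jack wants can be sketched in plain Python to make its shape concrete: one input record goes in, and any number of output records come out. This is only an illustration under stated assumptions. It does not use the MaxCompute UDF API, and the function name `extract_terms` and the regex tokenizer are hypothetical stand-ins for real term extraction.

```python
import re
from typing import Iterator


def extract_terms(article: str) -> Iterator[str]:
    """Yield each term of one article as a separate output record."""
    # Hypothetical tokenizer: runs of letters and digits stand in for
    # real natural-language term extraction.
    for match in re.finditer(r"[A-Za-z0-9]+", article):
        yield match.group(0).lower()


# Two input records produce six output records, one per term.
articles = ["MaxCompute stores data", "Jack analyzes news"]
terms = [term for article in articles for term in extract_terms(article)]
print(terms)
```

The point of the sketch is the cardinality: the number of output records depends on the content of each input record, not on the number of input records.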
03. In DataWorks, a scheduled task is instantiated before it runs; that is, an instance is generated and then executed to run the scheduled task. Instances displayed in the Task Operation & Management (O&M) view of the O&M Center are generated automatically from scheduled tasks.
We can perform the ________ operations on these instances.
(Number of correct answers: 2)
a) Test
b) Data population
c) Viewing node run logs
d) Re-running and resuming scheduling
04. We use the MaxCompute tunnel command to upload the file log.txt to the table t_log. t_log is a partitioned table whose partition columns are (p1 string, p2 string).
Which of the following commands is correct?
a) tunnel upload log.txt t_log/(p1="b1",p2="b2")
b) tunnel upload log.txt t_log/p1="b1",p2="b2"
c) tunnel upload log.txt t_log/p1="b1"/p2="b2"
d) tunnel upload log.txt t_log(p1="b1",p2="b2")
05. Authorization in MaxCompute means granting specified users certain permissions on certain objects. Which three of the following elements must an authorization include?
(Number of correct answers: 3)
a) Subject, which may be a user or role
b) Object, which may be a table or resource
c) Action, such as read, write, etc.
d) Effect, such as accept, reject, etc.
06. MaxCompute supports label-based security (Label Security), which is a project-level mandatory access control (MAC) policy. It was introduced so that project administrators can control user access to sensitive data more flexibly.
At which granularity can Label Security control access to sensitive data?
a) Table
b) Partition
c) Row
d) Column
07. In DataWorks, SQL Task1 is a periodic task scheduled to run daily. Every time Task1 runs, it uses the data of one partition in Table2 (the Table2 partition field is named ds and has the format yyyymmdd).
The Table2 partition value is the first day of the month of the current business date (for example, if the run time is May 10, 2018, the business date is 20180509 and the Table2 partition value is 20180501).
How should we configure Task1 so that the Table2 partition is defined with the time parameters provided by the scheduling system and the time value is replaced automatically at every scheduled run?
a) In Task1 code, Table2 partition ds='${var}01' and the parameter configuration: var=$[yyyymm-1].
b) In Task1 code, Table2 partition ds='${var}' and the parameter configuration: var=$[yyyymmdd].
c) In Task1 code, Table2 partition ds='${var}' and the parameter configuration: var=$[yyyymm]01.
d) In Task1 code, Table2 partition ds='${var}' and the parameter configuration: var=$[yyyymm01].
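The date arithmetic in the example above (run time, business date, partition value) can be checked with a short plain-Python sketch. It is not DataWorks scheduling-parameter syntax and says nothing about how the scheduler evaluates the expressions in the options. It assumes, as the example implies, that the business date is the day before the run date. The function names are hypothetical.

```python
from datetime import date, timedelta


def business_date(run_date: date) -> date:
    """Assumption taken from the example: business date = run date minus one day."""
    return run_date - timedelta(days=1)


def first_day_partition(run_date: date) -> str:
    """Return the first day of the business date's month, formatted as yyyymmdd."""
    return business_date(run_date).replace(day=1).strftime("%Y%m%d")


print(business_date(date(2018, 5, 10)).strftime("%Y%m%d"))  # 20180509
print(first_day_partition(date(2018, 5, 10)))               # 20180501
```

Note the month boundary: for a run on June 1, 2018 the business date is May 31, so under these assumptions the partition value is still 20180501.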
08. When odpscmd is used to connect to a project in MaxCompute, the command ______ can be executed to view the size of the space occupied by table table_a.
a) select size from table_a;
b) size table_a;
c) desc table_a;
d) show table table_a;
09. Which of the following statements about jobs and execution plans in E-MapReduce are correct?
(Number of correct answers: 3)
a) In E-MapReduce, creating a job means creating a configuration that describes how to run the job. A job cannot be run directly. The configuration of a job must contain the jar package to run, the input and output addresses of the data, and some running parameters.
b) You cannot view job logs on the worker nodes in E-MapReduce.
c) The execution plan is the link that associates a job with a cluster.
d) Through an execution plan, multiple jobs can be combined into a job sequence, and a cluster can be prepared to run them (either by automatically creating a temporary cluster or by associating an existing cluster).
10. A distributed system consists of hundreds or even thousands of storage machines built from inexpensive commodity parts and is accessed by a comparable number of client machines.
The quantity and quality of the components virtually guarantee that some are not functional at any given time and that some will not recover from their current failures.
Based on the information above, the first thing to consider when designing a distributed system is ____.
a) component failures
b) files are huge by traditional standards
c) data locality
d) metadata management