School of Computing and Information Technology        Session: Spring 2023
University of Wollongong                              Lecturer: Janusz R. Getta
 
ISIT912 Big Data Management 
Assignment 1 
Published on 24 July 2023 
             
 
Scope 
This assignment includes tasks related to the implementation of an HDFS application and 
the implementation of MapReduce applications. 
 
This assignment is due on Saturday, 19 August 2023, 7:00pm (sharp). 
 
This assignment is worth 10% of the total evaluation in the subject. 
 
The assignment consists of 4 tasks, and the specification of each task starts on a new page. 
 
Only electronic submission through Moodle at:  
https://moodle.uowplatform.edu.au/login/index.php 
will be accepted. The submission procedure is explained at the end of the Assignment 1 specification. 
 
A policy regarding late submissions is included in the subject outline. 
 
Only one submission of Assignment 1 is allowed and only one submission per student is 
accepted. 
 
A submission marked by Moodle as "late" is always treated as a late submission no matter how 
many seconds it is late. 
 
A submission that contains an incorrect file attached is treated as a correct submission with all 
consequences coming from the evaluation of the file attached.  
 
All files left on Moodle in the state "Draft (not submitted)" will not be evaluated. 
 
A submission of compressed files (zipped, gzipped, rared, tarred, 7-zipped, lhzed, etc.) is not 
allowed. Compressed files will not be evaluated.  
 
An implementation that does not compile or run because of one or more syntax or run-time 
errors scores no marks. 
 
The first assignment is an individual assignment, and it is expected that all its tasks will be 
solved individually without any cooperation with other students. However, you may 
declare in the submission comments that a particular component or task of this assignment has 
been implemented in cooperation with another student. In such a case, the evaluation of that task or 
component may be shared with the other student. In all other cases, plagiarism will result in a FAIL 
grade being recorded for the entire assignment. If you have any doubts or questions, please consult 
your lecturer or tutor during laboratory/tutorial classes or by e-mail. 
              
Task 1 (1 mark) 
Merging files in HDFS 
 
Read and analyse the HDFS applications provided in the files FileSystemCat.java and 
FileSystemPut.java, available in the folder Resources attached to the 
specification of the laboratory class for Week 2 on Moodle. 
 
Use the applications FileSystemCat.java and FileSystemPut.java to 
implement, in Java, an HDFS application that merges two files located in HDFS into one file 
also located in HDFS. 
 
The application must have the following parameters. 
(1) A path to, and a name of the first input file in HDFS.  
(2) A path to, and a name of the second input file in HDFS. 
(3) A path to, and a new name of, an output file to be created in HDFS. The file is supposed 
to contain the contents of the first input file followed by the contents of the second 
input file. 
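
The required ordering of the output (first file, then second file) can be illustrated with plain Java streams. The following is only a minimal sketch under stated assumptions: the class MergeSketch and its methods are hypothetical names, and in-memory byte streams stand in for HDFS files so that the example runs without a cluster. In an actual HDFS application the streams would presumably come from the Hadoop FileSystem API (for example FileSystem.open and FileSystem.create), as in the laboratory examples.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; it is not part of FileSystemCat.java or FileSystemPut.java.
class MergeSketch {

    // Copies every byte of 'in' to 'out' without closing either stream.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }

    // Writes the contents of 'first' followed by the contents of 'second' to 'out'.
    static void merge(InputStream first, InputStream second, OutputStream out)
            throws IOException {
        copy(first, out);
        copy(second, out);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the two input files (an assumption of this sketch).
        InputStream first =
            new ByteArrayInputStream("bolt 2 25\n".getBytes(StandardCharsets.UTF_8));
        InputStream second =
            new ByteArrayInputStream("nail 5 10\n".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        merge(first, second, out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Running main prints the line "bolt 2 25" followed by the line "nail 5 10", that is, the first input followed by the second.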
 
Implement the application and save its source code in a file solution1.java.  
 
Upload two files to HDFS. The contents, the names, and the locations of the files in 
HDFS are up to you. 
 
When ready, compile the application, create a jar file, and run it. Display the results 
created by the application. 
 
Use Hadoop to provide evidence that the two files uploaded into HDFS have been 
successfully merged into one file in HDFS. 
 
Deliverables 
A file solution1.txt that contains a listing of the source code of your application, a 
report from compilation, creation of the jar file, uploading to HDFS of two small files for 
testing, a listing of both files in HDFS, processing of the application, and evidence 
that the two files uploaded into HDFS have been successfully merged into one file in HDFS. The file 
solution1.txt must be created through Copy/Paste of the contents of the Terminal 
window into a file solution1.txt. No screen dumps are allowed and no screen 
dumps will be evaluated. 
              
Task 2 (2 marks) 
Implementation of a simple MapReduce application 
 
Read and analyse the MapReduce application provided in the file Filter.java, available in the 
folder Resources attached to the specification of the laboratory class for Week 3 on Moodle. 
 
The application has the functionality equivalent to the functionality of the following SQL 
statement:  
 
SELECT key, value 
FROM sequence-of-key-value-pairs 
WHERE value > given-value; 
 
An objective of this task is to use the Java code provided in a file Filter.java to 
implement a MapReduce application Solution2 that has the functionality equivalent to 
the functionality of the following SQL statement: 
 
SELECT item-name, price-per-unit * total-units 
FROM sales.txt 
WHERE price-per-unit * total-units > given-value; 
 
A single line in an input data set sales.txt must have the following format. 
 
item-name price-per-unit total-units 
 
For example: 
 
bolt 2 25 
washer 3 8 
screw 7 20 
nail 5 10 
screw 7 2 
bolt 2 20 
bolt 2 30 
drill 10 5 
washer 3 8 
 
The contents of the file sales.txt are up to you as long as they are consistent with the format 
explained above. 
 
A value of given-value must be passed through a parameter of your program. 
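
The result expected from the SQL statement above can be checked against a small in-memory computation. The following is a minimal sketch in plain Java, not a MapReduce implementation: the class SalesFilterSketch, its method filter, and the default threshold of 45 are all assumptions made for illustration, and whitespace-separated integer fields are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class; it only mirrors the semantics of the SQL statement, without Hadoop.
class SalesFilterSketch {

    // Returns "item-name total" for every line whose
    // price-per-unit * total-units is strictly greater than givenValue.
    static List<String> filter(List<String> lines, long givenValue) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip lines that do not match the expected format
            }
            long total = Long.parseLong(fields[1]) * Long.parseLong(fields[2]);
            if (total > givenValue) {
                result.add(fields[0] + " " + total);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sales = Arrays.asList(
            "bolt 2 25", "washer 3 8", "screw 7 20", "nail 5 10", "screw 7 2",
            "bolt 2 20", "bolt 2 30", "drill 10 5", "washer 3 8");
        // given-value comes from a program parameter; 45 is an arbitrary default.
        long givenValue = args.length > 0 ? Long.parseLong(args[0]) : 45;
        for (String row : filter(sales, givenValue)) {
            System.out.println(row);
        }
    }
}
```

With the example data above and a given-value of 45, the rows that pass the filter are bolt 50, screw 140, nail 50, bolt 60, and drill 50. Because the comparison is strict, a given-value of 50 leaves only screw 140 and bolt 60.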
 
Save your solution in a file Solution2.java.  
 
When ready, list Solution2.java in the Terminal window, compile it, create a jar file, and 
process the application. List the input dataset sales.txt in the Terminal window together with the 
results created by the application. When completed, Copy and Paste all messages from the 
Terminal screen into a file solution2.txt. 
 
Deliverables 
A file solution2.txt with a listing of the source code of your application, a report from 
compilation, creation of the jar file, processing of the application, a listing of the file sales.txt, 
and a listing of the results of processing of the MapReduce application Solution2.java. The 
file solution2.txt must be created through Copy/Paste of the contents of the Terminal 
window into a file solution2.txt. No screen dumps are allowed and no screen 
dumps will be evaluated. 
              
  
 
Task 3 (3 marks) 
Implementation of a simple MapReduce application 
 
Read and analyse the MapReduce application provided in the file MinMax.java, available in the 
folder Resources attached to the specification of the laboratory class for Week 3 on Moodle. 
 
The application has the functionality equivalent to the functionality of the following SQL 
statement.  
 
SELECT key, MIN(value), MAX(value) 
  FROM sequence-of-key-value-pairs 
  GROUP BY key; 
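
The grouping semantics of this statement can be sketched in plain Java. This is not the code of MinMax.java: the class MinMaxSketch is a hypothetical name, and the input format (one whitespace-separated "key value" pair per line with integer values) is an assumption, since the specification does not state it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; it mirrors SELECT key, MIN(value), MAX(value) ... GROUP BY key.
class MinMaxSketch {

    // Returns, for each key, a two-element array {min, max} of its values.
    static Map<String, long[]> minMax(List<String> pairs) {
        Map<String, long[]> result = new TreeMap<>();
        for (String pair : pairs) {
            String[] fields = pair.trim().split("\\s+");
            if (fields.length != 2) {
                continue; // skip lines that are not "key value" (assumed format)
            }
            long value = Long.parseLong(fields[1]);
            long[] range = result.get(fields[0]);
            if (range == null) {
                // First value seen for this key is both the minimum and the maximum.
                result.put(fields[0], new long[] {value, value});
            } else {
                range[0] = Math.min(range[0], value);
                range[1] = Math.max(range[1], value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pairs = Arrays.asList("a 3", "b 7", "a 1", "a 9");
        for (Map.Entry<String, long[]> entry : minMax(pairs).entrySet()) {
            long[] range = entry.getValue();
            System.out.println(entry.getKey() + " " + range[0] + " " + range[1]);
        }
    }
}
```

For the four illustrative pairs in main, the program prints "a 1 9" and "b 7 7". In a MapReduce setting the per-key grouping is done by the framework between the map and reduce phases, so only the minimum and maximum update would remain in the reducer.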
 
An objective of this task is to use the Java code provided in a file MinMax.java to 
implement a MapReduce application Solution3 that has the functionality equivalent to 
the functionality of the following SQL statement. 
  
SELECT item-name, SUM(price-per-unit * total-units) 
FROM sales.txt 
GROUP BY item-name; 
 
A single line in an input data set sales.txt must have the following format. 
item-name price-per-unit total-units 
 
For example: 
 
bolt 2 25 
washer 3 8 
screw 7 20 
nail 5 10 
screw 7 2 
bolt 2 20 
bolt 2 30 
drill 10 5 
washer 3 8 
 
The contents of the file sales.txt are up to you as long as they are consistent with the format 
explained above. 
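
To check the output of Solution3 on a small input, the expected totals can be computed in memory. The following is a minimal plain-Java sketch, not a MapReduce implementation: the class SalesSumSketch and its method totals are hypothetical names, and whitespace-separated integer fields are assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; it mirrors SELECT item-name, SUM(price-per-unit * total-units)
// ... GROUP BY item-name, without using Hadoop.
class SalesSumSketch {

    // Returns item-name -> sum of price-per-unit * total-units over all lines.
    static Map<String, Long> totals(List<String> lines) {
        Map<String, Long> result = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip lines that do not match the expected format
            }
            long amount = Long.parseLong(fields[1]) * Long.parseLong(fields[2]);
            // Add the amount to the running total for this item.
            result.merge(fields[0], amount, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sales = Arrays.asList(
            "bolt 2 25", "washer 3 8", "screw 7 20", "nail 5 10", "screw 7 2",
            "bolt 2 20", "bolt 2 30", "drill 10 5", "washer 3 8");
        for (Map.Entry<String, Long> entry : totals(sales).entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

For the example data above, the totals are bolt 150, drill 50, nail 50, screw 154, and washer 48.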
 
Save your solution in a file Solution3.java.  
 
When ready, list Solution3.java in the Terminal window, compile it, create a jar file, and 
process the application. List the input dataset sales.txt in the Terminal window together with the 
results created by the application. When completed, Copy and Paste all messages from the 
Terminal screen into a file solution3.txt. 
 
Deliverables 
A file solution3.txt with a listing of the source code of your application, a report from 
compilation, creation of the jar file, processing of the application, a listing of the file sales.txt, and 
a listing of the results of processing of the MapReduce application Solution3.java. The file 
solution3.txt must be created through Copy/Paste of the contents of the Terminal 
window into a file solution3.txt. No screen dumps are allowed and no screen 
dumps will be evaluated. 
             
 
 
 
  
Task 4 (4 marks) 
Describing MapReduce application 
 
The files orders.txt and details.txt contain information about the orders 
submitted by the customers and the details of each order.  
 
A single line in a file orders.txt has the following structure: 
order-number date customer-id 
 
For example: 
0000001 12-JUN-2022 CUST-A02 
0000002 12-JUN-2022 CUST-A01 
0000003 13-JUN-2022 CUST-A02 
0000004 15-JUL-2022 CUST-F01 
0000005 16-JUL-2022 CUST-A01 
 
A single line in a file details.txt has the following structure: 
order-number item price 
 
For example: 
0000001 bolt 15 
0000001 screw 5 
0000002 screw 5 
0000002 bolt 10 
0000002 bigbolt 50 
 
An objective of this task is to describe a MapReduce application that computes the total 
amount of money spent by all customers in a given year on a given item. For example, 
the total amount of money spent on bolts in 2022, or the total amount of money spent 
on screws in 2020. 
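
The value the application is expected to produce can be pinned down with a small in-memory computation. The following plain-Java sketch is not a MapReduce design; it only joins the two inputs on order-number and sums the matching prices. The class SpendingSketch and its method totalSpent are hypothetical names. It also assumes that the year is the part of the date after the last hyphen and that the price column holds the integer amount spent on that line.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class; it computes the expected answer without using Hadoop.
class SpendingSketch {

    // Returns the total of 'price' over all details lines for 'item'
    // whose order was placed in 'year'.
    static long totalSpent(List<String> orders, List<String> details,
                           String year, String item) {
        // Build a lookup table: order-number -> year of the order.
        Map<String, String> yearOfOrder = new HashMap<>();
        for (String line : orders) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip malformed lines
            }
            String date = fields[1]; // assumed format: DD-MON-YYYY
            yearOfOrder.put(fields[0], date.substring(date.lastIndexOf('-') + 1));
        }
        long total = 0;
        for (String line : details) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip malformed lines
            }
            // Keep only lines for the given item whose order falls in the given year.
            if (item.equals(fields[1]) && year.equals(yearOfOrder.get(fields[0]))) {
                total += Long.parseLong(fields[2]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> orders = Arrays.asList(
            "0000001 12-JUN-2022 CUST-A02", "0000002 12-JUN-2022 CUST-A01",
            "0000003 13-JUN-2022 CUST-A02", "0000004 15-JUL-2022 CUST-F01",
            "0000005 16-JUL-2022 CUST-A01");
        List<String> details = Arrays.asList(
            "0000001 bolt 15", "0000001 screw 5", "0000002 screw 5",
            "0000002 bolt 10", "0000002 bigbolt 50");
        System.out.println(totalSpent(orders, details, "2022", "bolt"));
    }
}
```

For the example data above, the total spent on bolt in 2022 is 25 (15 + 10), and the total spent on screw in 2020 is 0 because no order in the example is dated 2020. The sketch makes visible that the year lives in orders.txt while the item and price live in details.txt, so a description of the MapReduce application has to explain how records from the two files are brought together by order-number.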
 
A description of the application must include the following details: 
- preparation of data for processing, 
- the parameters of the application, 
- a detailed description of Driver, 
- a detailed description of Mapper, 
- a detailed description of Reducer, 
- accessing the results. 
 
The detailed descriptions of Driver, Mapper and Reducer must contain all information 
needed for the implementations of Driver, Mapper and Reducer. You can use pseudocode 
whenever it is necessary.  
 
Save your description of the MapReduce application that computes the total amount of 
money spent by all customers in a given year on a given item in a file solution4.pdf.  
 
Deliverables 
A file solution4.pdf with a detailed description of the MapReduce application that 
computes the total amount of money spent by all customers in a given year on a given 
item.   
              
  
Submission of Assignment 1 
 
Note that you have only one submission, so make absolutely sure that you submit 
the correct files with the correct contents. No other submission is possible! 
 
Submit the files solution1.txt, solution2.txt, solution3.txt, and 
solution4.pdf through Moodle in the following way: 
(1) Access Moodle at http://moodle.uowplatform.edu.au/ 
(2) To log in, use the Login link located in the upper right corner of the Web page or in 
the middle of the bottom of the Web page 
(3) When logged in, select the site ISIT312/912 (S223) Big Data 
Management  
(4) Scroll down to a section Assessment items (Assignments) 
(5) Click on the link In this place you can submit the outcomes of your 
work on the tasks included in Assignment 1. 
(6) Click on the button Add Submission 
(7) Move the file solution1.txt into the area You can drag and drop 
files here to add them. You can also use the link Add… 
(8) Repeat step (7) for the remaining files solution2.txt, solution3.txt, 
and solution4.pdf  
(9) Click on the button Save changes 
(10) Click on the checkbox with the text attached: By checking this box, I 
confirm that this submission is my own work, … in order to 
confirm the authorship of your submission. 
(11) Click on the button Continue 
(12) Check that Submission status is Submitted for grading. 
             
End of specification