CSEP 590A

Logistics

Lectures: Wednesday, 6:30-9:20pm, room CSE2 G010.

Public resources: The lecture slides and assignments will be posted online as the course progresses. We are happy for anyone to use these resources, but we cannot grade the work of any students who are not officially enrolled in the class.

Grading and evaluation: There will be 10 Colabs (20%), 4 homeworks (40%), and a course project (40%). Students should upload their submissions to GradeScope (Entry Code: X3WYKY). More information here.

Office hours: Information here.

Contact: Students should ask all course-related questions on the EdDiscussion forum, where you will also find announcements. For external enquiries, personal matters, or in emergencies, you can email us at csep590a-instructors-instructors@cs.washington.edu. Please do not use other emails (such as the TAs' UW email IDs) for questions about the class, as these may not be answered in a timely manner.

Instructor

Tim Althoff

Teaching Assistants

Ken Gu (Head TA)
Dong He
Hao Peng

Content

What is this course about? The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce and Spark as tools for creating parallel algorithms that can process very large amounts of data.

Topics include: Frequent itemsets and Association rules, Near Neighbor Search in High Dimensional Data, Locality Sensitive Hashing (LSH), Dimensionality reduction, Recommendation Systems, Clustering, Link Analysis, Large scale supervised machine learning, Data streams, Mining the Web for Structured Data.

This course is modeled after CS246: Mining Massive Datasets by Jure Leskovec at Stanford University.

Reference Text

The following text is useful, but not required. It can be downloaded for free, or purchased from Cambridge University Press.
Leskovec-Rajaraman-Ullman: Mining of Massive Datasets

Prerequisites

Students are expected to have the following background:

Knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program (e.g., CS332, CS373 or equivalent are recommended).
Good knowledge of Python and Java will be extremely helpful since several assignments will require the use of Spark/Hadoop.
Familiarity with basic probability theory (any introductory probability course).
Familiarity with writing rigorous proofs (e.g., CS311 or equivalent).
Familiarity with basic linear algebra (e.g., Math 308 or equivalent).
Familiarity with algorithmic analysis (e.g., CS332/CS373; CS417/CS421 would be more than necessary).

Students may refer to the following materials for an overview and review of the expected background. Related questions can be posted on EdDiscussion or asked during Office Hours.

Probability and Proof Techniques
Linear Algebra
Spark Tutorial (a video is available through Stanford CS246)

Students may decide to enroll without knowledge of these prerequisites, but should expect a significant increase in workload to learn them concurrently (e.g., 10 hours per week per missing prerequisite as a rule of thumb).

Accessibility & Accommodations

Embedded in the core values of the University of Washington is a commitment to ensuring access to a quality higher education experience for a diverse student population. Disability Resources for Students (DRS) recognizes disability as an aspect of diversity that is integral to society and to our campus community. DRS serves as a partner in fostering an inclusive and equitable environment for all University of Washington students. The DRS office is in 011 Mary Gates Hall. Please see the UW resources at: http://depts.washington.edu/uwdrs/current-students/accommodations/.
Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).

Schedule

Note: Lectures will be in person in room CSE2 G010. Lecture slides will be posted here shortly before each lecture. This schedule is subject to change.

Each entry below lists the Date, followed by the Description, Course Materials, Events, and Deadlines that apply.

Wed Mar 30
  Description: Introduction; MapReduce and Spark; Frequent Itemsets Mining
  Course Materials: Course Information: handout. Suggested Readings: Ch1: Data Mining; Ch2: Large-Scale File Systems and Map-Reduce; Ch6: Frequent itemsets
  Events: Start planning course project [Teams signup form]; [Colab 0] [Colab 1] & Assignment 1 out

TBD
  Description: Recitation: Spark

Wed Apr 6
  Description: Locality-Sensitive Hashing
  Course Materials: Suggested Readings: Ch3: Finding Similar Items (Sect. 3.1-3.4 and 3.5-3.8)
  Events: [Colab 2] out
  Deadlines: Colab 0, Colab 1 due

TBD
  Description: Recitation: Probability and Proof Techniques

TBD
  Description: Recitation: Linear Algebra

Wed Apr 13
  Description: Clustering; Dimensionality Reduction
  Course Materials: Suggested Readings: Ch7: Clustering (Sect. 7.1-7.4); Ch11: Dimensionality Reduction (Sect. 11.4)
  Events: [Colab 3] & Assignment 2 out [handout, bundle file]
  Deadlines: Colab 2 & Assignment 1 due

Wed Apr 20
  Description: Recommender Systems
  Course Materials: Suggested Readings: Ch9: Recommendation systems
  Events: [Colab 4] out
  Deadlines: Colab 3 due, Project Proposal due (no late periods)

Wed Apr 27
  Description: PageRank; Link Spam and Introduction to Social Networks
  Course Materials: Suggested Readings: Ch5: Link Analysis (Sect. 5.1-5.5); Ch10: Analysis of Social Networks (Sect. 10.1-10.2, 10.6)
  Events: [Colab 5] & Assignment 3 out [handout, bundle file]
  Deadlines: Colab 4 & Assignment 2 due

Wed May 4
  Description: Community Detection in Graphs; Graph Representation Learning
  Course Materials: Suggested Readings: Ch10: Analysis of Social Networks (Sect. 10.3-10.5, 10.7-10.8)
  Events: [Colab 6] out
  Deadlines: Colab 5 due

Sun May 8
  Deadlines: Project Milestone due (no late periods)

Wed May 11
  Description: Large-Scale Machine Learning
  Course Materials: Suggested Readings: Ch12: Large-Scale Machine Learning
  Events: [Colab 7] & Assignment 4 out [handout, bundle file]
  Deadlines: Colab 6 & Assignment 3 due

Wed May 18
  Description: Mining Data Streams
  Course Materials: Suggested Readings: Ch4: Mining data streams (Sect. 4.1-4.7)
  Events: [Colab 8] out
  Deadlines: Colab 7 due

Wed May 25
  Description: Course Project Meetings; Optimizing Submodular Functions. Sign up for meeting slots on EdDiscussion.
  Course Materials: Suggested Readings: TimeMachine: Timeline Generation for Knowledge-base Entities by Althoff, Dong, Murphy, Alai, Dang, Zhang. KDD 2015.
  Events: [Colab 9] out
  Deadlines: Assignment 4 due

Wed Jun 1
  Description: Causal Inference
  Deadlines: Colab 8 due

TBD
  Deadlines: Colab 9 due & Final Report due & Presentation video due (no late periods)

TBD
  Description: Virtual Project Presentations