It is a programme ideal for
- those who are interested in acquiring skills in big data analytics/artificial intelligence, and
- those who wish to pursue further study in the field of data science after studying science, social sciences, engineering, medical sciences, information systems, computing and data analytics in their undergraduate studies.
Admission Requirements
To be eligible for admission to the programme, you should:
- hold a Bachelor's degree with honours, or an equivalent qualification;
- have taken at least one university or post-secondary certificate course in each of the following three subjects (calculus and algebra, computer programming, and introductory statistics) or related areas; and
- fulfil the University Entrance Requirements.
- Main Round: November 2, 2020 to December 15, 2020. Candidates who apply within this period will have priority.
- Clearing Round: December 16, 2020 to 12 noon February 2, 2021.
Applications can be submitted via our online application system.
The tuition fee for the programme is HK$264,000* for the 2021-22 intake. The fee shall normally be payable in three instalments over 1.5 years for full-time study or in five instalments over 2.5 years for part-time study.
In addition, students are required to pay Caution Money (HK$350), refundable on graduation subject to no claims being made, and Graduation Fee (HK$350).
* Subject to approval
Targeted Taught Postgraduate Programmes Fellowships Scheme
MDASC is sponsored by the University Grants Committee (UGC) under the Targeted Taught Postgraduate Programmes Fellowships Scheme. Local offer recipients who will be MDASC students in the academic year 2021-22, whether full-time or part-time, are eligible to apply (other terms and conditions apply). Each successful Fellowship Scheme applicant will receive an award of HK$120,000.
Reimbursable Course(s) by Continuing Education Fund (CEF)
The following courses have been included in the list of reimbursable courses for CEF purposes:
COMP7503 Multimedia Technologies
COMP7506 Smart Phone Apps Development
COMP7507 Visualization and Visual Analytics
COMP7906 Introduction to Cyber Security
STAT8017 Data mining techniques
STAT8019 Marketing analytics
The mother programme (Master of Data Science) of these courses is recognized under the Qualifications Framework (QF Level 6).
Suoxinda Scholarship in Data Science
Two scholarship recipients, each receiving HK$20,000, will be selected from the students entering the Master of Data Science programme on the basis of academic merit and admission interview performance.
Programme Information
The Master of Data Science (MDASC) is a taught master's programme jointly offered by the Department of Statistics and Actuarial Science (host) and the Department of Computer Science.
Its interdisciplinarity promotes the applications of computer technology, operational research, statistical modelling, and simulation to decision-making and problem-solving in all organizations and enterprises within the private and public sectors.
The curriculum of the MDASC programme adopts a well-balanced and comprehensive pedagogy of both statistical and computational concepts and methodologies, underpinning applications that are not limited to business or a single field alone.
It is a programme ideal for
- those whose interest in high-level analytical skills straddles the disciplinary divide between statistics and computational analytics, and
- those who wish to pursue further study in the field of data science after studying science, social sciences, engineering, medical sciences, information systems, computing and data analytics in their undergraduate studies.
Programme Highlights
Joint programme offered by Department of Statistics and Actuarial Science and Department of Computer Science
Interdisciplinary and comprehensive curriculum
- Solid foundation in statistical and computational analyses
- Students can select electives from Computer Science, Mathematics and Statistics
- Electives cover a broad range of contemporary topics
- Hands-on applications of methodologies with powerful software
- Capstone project with real-life scenario
Course Highlights
The core courses of the proposed MDASC programme mainly focus on both predictive and prescriptive concepts and methodologies with an effort to equip students with a solid foundation in statistical and computational analyses, e.g.
Data Science technology | Computational intelligence |
Time series forecasting | Deep learning |
The electives cover a broad range of contemporary topics and provide students with solid training in diverse and applied techniques used in data science, including but not limited to
Financial data analysis | Marketing analytics | Quantitative risk |
Data mining techniques | Network security | |
Cluster & cloud computing | Natural language processing | |
Multimedia technologies | Smart phone apps development |
Programme Curriculum
Commencing in September, the curriculum is composed of 72 credits of courses. Courses with 6 credits are offered in the first and second semesters while courses with 3 credits are normally offered in the summer semester. If a student selects a course whose contents are similar to a course (or courses) which he/she has taken in his/her previous study, the Department may not approve the selection in question. The curriculum is the same for both full-time and part-time study modes.
Compulsory Courses (36 credits) | ||
COMP7404 | Computational intelligence and machine learning (6 credits) | |
DASC7011 | Statistical inference for data science (6 credits) | |
DASC7104 | Advanced database systems (6 credits) | |
DASC7606 | Deep learning (6 credits) | |
STAT7102 | Advanced statistical modelling (6 credits) | |
STAT8003 | Time series forecasting (6 credits) | |
Disciplinary Electives (24 credits)* with at least 12 credits from List A and 12 credits from List B | ||
List A | ||
COMP7105 | Advanced topics in data science (6 credits) | |
COMP7305 | Cluster and cloud computing (6 credits) | |
COMP7409 | Machine learning in trading and finance (6 credits) | |
COMP7503 | Multimedia technologies (6 credits) | |
COMP7506 | Smart phone apps development (6 credits) | |
COMP7507 | Visualization and visual analytics (6 credits) | |
COMP7906 | Introduction to cyber security (6 credits) | |
FITE7410 | Financial fraud analytics (6 credits) | |
ICOM6044 | Data science for business (6 credits) | |
List B | ||
MATH8502 | Topics in applied discrete mathematics (6 credits) | |
MATH8503 | Topics in mathematical programming and optimization (6 credits) | |
STAT6008 | Advanced statistical inference (6 credits) | |
STAT6013 | Financial data analysis (6 credits) | |
STAT6015 | Advanced quantitative risk management and finance (6 credits) | |
STAT6016 | Spatial data analysis (6 credits) | |
STAT6019 | Current topics in statistics (6 credits) | |
STAT7008 | Programming for data science (6 credits) | |
STAT8017 | Data mining techniques (6 credits) | |
STAT8019 | Marketing analytics (6 credits) | |
STAT8306 | Statistical methods for network data (3 credits) | |
STAT8307 | Natural language processing and text analytics (3 credits) | |
* Students who have completed the same courses in their previous studies at HKU (e.g. in the Master of Statistics or the Master of Science in Computer Science) may, on production of relevant transcripts, be permitted to select up to 24 credits of disciplinary electives from either List A or List B above if they are unable to find any untaken options in either of the lists of disciplinary electives. | ||
Capstone requirement (12 credits) | ||
DASC7600 | Data Science Project (12 credits) |
COURSE DESCRIPTION
Compulsory Courses |
---|
COMP7404 Computational intelligence and machine learning (6 credits)
This course will teach a broad set of principles and tools that will provide the mathematical, algorithmic and philosophical framework for tackling problems using Artificial Intelligence (AI) and Machine Learning (ML). AI and ML are highly interdisciplinary fields with impact in different applications, such as biology, robotics, language, economics, and computer science. AI is the science and engineering of making intelligent machines, especially intelligent computer programs, while ML refers to the changes in systems that perform tasks associated with AI. Ethical issues in advanced AI and how to prevent learning algorithms from acquiring morally undesirable biases will be covered.
Pre-requisites: Nil, but knowledge of data structures and algorithms, probability, linear algebra, and programming would be an advantage.
Assessment: coursework (50%) and examination (50%) |
DASC7011 Statistical inference for data science (6 credits)
Computing power has revolutionized the theory and practice of statistical inference. Reciprocally, novel statistical inference procedures are becoming an integral part of data science. By focusing on the interplay between statistical inference and methodologies for data science, this course reviews the main concepts underpinning classical statistical inference, studies computer-intensive methods for conducting statistical inference, and examines important issues concerning statistical inference drawn upon modern learning technologies. Contents include classical frequentist and Bayesian inferences, computer-intensive methods such as the EM algorithm, the bootstrap and the Markov chain Monte Carlo, large-scale hypothesis testing, high-dimensional modeling, and post-model-selection inference.
Assessment: coursework (40%) and examination (60%) |
DASC7104 Advanced database systems (6 credits)
The course will study some advanced topics and techniques in database systems, with a focus on the aspects of big data analytics, algorithms, and system design & organisation. It will also survey the recent development and progress in selected areas. Topics include: query optimization, spatial-spatiotemporal data management, multimedia and time-series data management, information retrieval and XML, data mining.
Assessment: coursework (50%) and examination (50%) |
DASC7606 Deep learning (6 credits)
Machine learning is a fast-growing field in computer science, and deep learning is the cutting-edge technology that enables machines to learn from large-scale and complex datasets. Ethical implications of deep learning and its applications will be covered first, and the course will then focus on how deep neural networks are applied to solve a wide range of problems in areas such as natural language processing, image processing, financial predictions, game playing and robotics. Topics covered include linear and logistic regression, artificial neural networks and how to train them, recurrent neural networks, convolutional neural networks, deep reinforcement learning and unsupervised feature learning. Popular deep learning software, such as TensorFlow, will also be introduced.
Assessment: coursework (50%) and examination (50%) |
STAT7102 Advanced statistical modelling (6 credits)
This course introduces modern methods for constructing and evaluating statistical models and their implementation using popular computing software, such as R or Python. It will cover both the underlying principles of each modelling approach and the model estimation procedures. Topics from: (i) Linear regression models; (ii) Generalized linear models; (iii) Model selection and regularization; (iv) Kernel and local polynomial regression; (v) Generalized additive models; (vi) Hidden Markov model and Bayesian networks.
Assessment: coursework (50%) and examination (50%) |
STAT8003 Time series forecasting (6 credits)
A time series consists of a set of observations on a random variable taken over time. Such series arise naturally in climatology, economics, finance, environmental research and many other disciplines. In addition to statistical modelling, the course deals with the prediction of future behaviour of these time series. This course distinguishes different types of time series, investigates various representations for them and studies the relative merits of different forecasting procedures.
Assessment: coursework (40%) and examination (60%) |
Disciplinary Electives |
COMP7105 Advanced topics in data science (6 credits)
This course will introduce selected advanced computational methods and apply them to problems in data analysis and relevant applications. |
COMP7305 Cluster and cloud computing (6 credits)
This course offers an overview of current cloud technologies, and discusses various issues in the design and implementation of cloud systems. Topics include cloud delivery models (SaaS, PaaS, and IaaS) with motivating examples from Google, Amazon, and Microsoft; virtualization techniques implemented in Xen, KVM, VMware, and Docker; distributed file systems, such as the Hadoop file system; MapReduce and Spark programming models for large-scale data analysis; and networking techniques in hyper-scale data centers. The students will learn to use Amazon EC2 to deploy applications on the cloud, and will implement a novel cloud computing application on a Xen-enabled PC cluster as part of their term project.
Pre-requisites: Students are expected to install various open-source cloud software in their Linux cluster, and exercise the system configuration and administration. A basic understanding of the Linux operating system and some programming experience (C/C++, Java or Python) in a Linux environment are required.
Assessment: coursework (50%) and examination (50%) |
COMP7409 Machine learning in trading and finance (6 credits)
The course introduces students to the field of Machine Learning, and helps them develop skills in applying Machine Learning, or more precisely, applying supervised learning, unsupervised learning and reinforcement learning, to solve problems in Trading and Finance.
This course will cover the following topics. (1) Overview of Machine Learning and Artificial Intelligence, (2) Supervised Learning, Unsupervised Learning and Reinforcement Learning, (3) Major algorithms for Supervised Learning and Unsupervised Learning with applications to Trading and Finance, (4) Basic algorithms for Reinforcement Learning with applications to optimal trading, asset management, and portfolio optimization, (5) Advanced methods of Reinforcement Learning with applications to high-frequency trading, cryptocurrency trading and peer-to-peer lending.
Assessment: coursework (65%) and examination (35%) |
COMP7503 Multimedia technologies (6 credits)
This course presents fundamental concepts and emerging technologies for multimedia computing. Students are expected to learn how to develop various kinds of media communication, presentation, and manipulation techniques. At the end of the course, students should acquire the proper skill set to utilize, integrate and synchronize different information and data from media sources for building specific multimedia applications. Topics include media data acquisition methods and techniques; nature of perceptually encoded information; processing and manipulation of media data; multimedia content organization and analysis; trending technologies for future multimedia computing.
|
COMP7506 Smart phone apps development (6 credits)
Smart phones have become very popular in recent years. According to a study, by 2020, 70% of the world's population is projected to own a smart phone, an estimated total of almost 6.1 billion smartphone users in the world.
Smart phones play an important role in mobile communication and applications.
Smart phones are powerful as they support a wide range of applications (called apps). Most of the time, smart phone users just purchase their favorite apps wirelessly from the vendors. There is great potential for software developers to reach users worldwide. This course aims at introducing the design issues of smart phone apps. For example, the smart phone screen is usually much smaller than a computer monitor, and special attention must be paid to this aspect in order to develop attractive and successful apps. Various modern smart phone apps development environments and programming techniques (such as Java for Android phones, and Swift for iPhones) will also be introduced to help students develop their own apps.
Pre-requisites: Students should have basic programming knowledge.
Assessment: coursework (50%) and examination (50%) |
COMP7507 Visualization and visual analytics (6 credits)
This course introduces the basic principles and techniques in visualization and visual analytics, and their applications. Topics include human visual perception; color; visualization techniques for spatial, geospatial and multivariate data, graphs and networks; text and document visualization; scientific visualization; interaction and visual analysis.
Assessment: coursework (50%) and examination (50%) |
COMP7906 Introduction to cyber security (6 credits)
The aim of the course is to introduce different methods of protecting information and data in the cyber world, including the privacy issue. Topics include introduction to security; cyber attacks and threats; cryptographic algorithms and applications; network security and infrastructure.
Pre-requisites: Students should not have taken ICOM6045 Fundamentals of e-commerce security or equivalent.
|
FITE7410 Financial fraud analytics (6 credits)
This course aims at introducing various analytics techniques to fight against financial fraud. These analytics techniques include descriptive analytics, predictive analytics, and social network learning. Various data sets will also be introduced, including labeled and unlabeled data sets and social network data sets. Students learn fraud patterns by applying the analytics techniques to financial frauds such as insurance fraud and credit card fraud.
Key topics include: Handling of raw data sets for fraud detection; Applications of descriptive analytics, predictive analytics and social network analytics to construct fraud detection models; Financial Fraud Analytics challenges and issues when applied in business context.
Pre-requisites: Students should have basic knowledge of statistical concepts.
|
ICOM6044 Data science for business (6 credits)
The emerging discipline of data science combines statistical methods with computer science to solve problems in applied areas. In this case we focus on how data science can be used to solve business problems, especially those in electronic commerce. By its very nature e-commerce is able to generate large amounts of data, and data mining methods are quite helpful for managers in turning this data into knowledge, which in turn can be used to make better decisions. These data sets and their accompanying quantitative methods have the potential to dramatically change decision making in many areas of business. For example, ideas like interactive marketing, customer relationship management, and database marketing are pushing companies to utilize the information they collect about their customers in order to make better marketing decisions. This course focuses on how data science methods can be applied to solve managerial problems in marketing and electronic commerce. Our emphasis is developing a core set of principles that embody data science: empirical reasoning, exploratory and visual analysis, and predictive modeling. We use these core principles to understand many methods used in data mining and machine learning. Our strategy in this course is to survey several popular techniques and understand how they map into these core principles. These techniques are illustrated with case studies. However, the emphasis is not on the software for implementing these techniques but on understanding the inputs and outputs of these techniques and how they are used to solve business problems.
Pre-requisites: Students should not be taking or have taken STAT8017 Data mining techniques or equivalent.
Assessment: coursework (65%) and examination (35%) |
MATH8502 Topics in Applied Discrete Mathematics (6 credits)
This course aims to provide students with the opportunity to study some further topics in applied discrete mathematics. It covers a selection of topics in discrete mathematics applied in combinatorics and optimization (such as algebraic coding theory, cryptography, discrete optimization, etc.). The selected topics may vary from year to year.
Pre-requisites: Knowledge in introductory discrete mathematics. Students may be asked to present appropriate evidence of having met the pre-requisites for enrolling in this course.
Assessment: coursework (50%) and examination (50%) |
MATH8503 Topics in Mathematical Programming and Optimization (6 credits)
A study in greater depth of some special topics in mathematical programming or optimization. It is mainly intended for students in Operations Research or related subject areas. This course covers a selection of topics which may include convex programming, nonconvex programming, saddle point problems, variational inequalities, optimization theory and algorithms suitable for applications in various areas such as machine learning, artificial intelligence, imaging and computer vision. The selected topics may vary from year to year.
Pre-requisites: Knowledge in introductory mathematical programming and optimization. Students may be asked to present appropriate evidence of having met the pre-requisites for enrolling in this course.
Assessment: coursework (100%) |
STAT6008 Advanced statistical inference (6 credits)
This course covers the advanced theory of point estimation, interval estimation and hypothesis testing. Using a mathematically-oriented approach, the course provides a formal treatment of inferential problems, statistical methodologies and their underlying theory. It is suitable in particular for students intending to further their studies or to develop a career in statistical research. Contents include: (1) Decision problem – frequentist approach: loss function; risk; decision rule; admissibility; minimaxity; unbiasedness; Bayes’ rule; (2) Decision problem – Bayesian approach: prior and posterior distributions, Bayesian inference; (3) Estimation theory: exponential families; likelihood; sufficiency; minimal sufficiency; completeness; UMVU estimators; information inequality; large-sample theory of maximum likelihood estimation; (4) Hypothesis testing: uniformly most powerful (UMP) test; monotone likelihood ratio; UMP unbiased test; conditional test; large-sample theory of likelihood ratio; confidence set; (5) Nonparametric inference; bootstrap methods.
Assessment: coursework (40%) and examination (60%) |
STAT6013 Financial data analysis (6 credits)
This course aims at introducing statistical methodologies in analyzing financial data. Financial applications and statistical methodologies are intertwined in all lectures. Contents include: recent advances in modern portfolio theory, copulas, market microstructure and high-frequency data analysis, and FinTech applications with various computational tools such as artificial neural networks, Kalman filters and blockchain data analysis.
Assessment: coursework (40%) and examination (60%) |
STAT6015 Advanced quantitative risk management and finance (6 credits)
This course covers statistical methods and models of importance to risk management and finance and links finance theory to market practice via statistical modelling and decision making. Emphases will be put on empirical analyses to address the discrepancy between finance theory and market data. Contents include: Elementary Stochastic Calculus; Basic Monte Carlo and Quasi-Monte Carlo Methods; Variance Reduction Techniques; Simulating the value of options and the value-at-risk for risk management; Review of univariate volatility models; multivariate volatility models; Value-at-risk and expected shortfall; estimation, back-testing and stress testing; Extreme value theory for risk management.
Assessment: coursework (25%) and examination (75%) |
STAT6016 Spatial data analysis (6 credits)
This course covers statistical concepts and tools involved in modelling data which are correlated in space. Applications can be found in many fields including epidemiology and public health, environmental sciences and ecology, economics and others. Covered topics include: (1) Outline of three types of spatial data: point-level (geostatistical), areal (lattice), and spatial point process. (2) Model-based geostatistics: covariance functions and the variogram; spatial trends and directional effects; intrinsic models; estimation by curve fitting or by maximum likelihood; spatial prediction by least squares, by simple and ordinary kriging, by trans-Gaussian kriging. (3) Areal data models: introduction to Markov random fields; conditional, intrinsic, and simultaneous autoregressive (CAR, IAR, and SAR) models. (4) Hierarchical modelling for univariate spatial response data, including Bayesian kriging and lattice modelling. (5) Introduction to simple spatial point processes and spatio-temporal models. Real data analysis examples will be provided with dedicated R packages such as geoR.
Assessment: coursework (50%) and examination (50%) |
STAT6019 Current topics in statistics (6 credits)
This course includes two modules. The first module, Causal Inference, is an introduction to key concepts and methods for causal inference. Contents include 1) the counterfactual outcome, randomized experiment, observational study; 2) Effect modification, mediation and interaction; 3) Causal graphs; 4) Confounding, selection bias, measurement error and random variability; 5) Inverse probability weighting and the marginal structural models; 6) Outcome regression and the propensity score; 7) The standardization and the parametric g-formula; 8) G-estimation and the structural nested model; 9) Instrumental variable method; 10) Machine learning methods for causal inference; 11) Other topics as determined by the instructor. The second module, Posterior Inference and Simulation, covers topics from: 1) Large-sample properties of posterior distribution; 2) Langevin dynamics and Hamiltonian MCMC; 3) Sequential Monte Carlo methods; 4) Approximate Bayesian computation; 5) Variational Bayesian methods; 6) Other topics as determined by the instructor.
Assessment: coursework (25%) and examination (75%) |
STAT7008 Programming for data science (6 credits)
In the big data era, it is very easy to collect huge amounts of data. Capturing and exploiting the important information contained within such datasets poses a number of statistical challenges. This course aims to provide students with a strong foundation in the computing skills necessary to use R or Python to tackle some of these challenges. Possible topics to be covered may include exploratory data analysis and visualization, collecting data from a variety of sources (e.g. Excel, web-scraping, APIs and others), object-oriented programming concepts and scientific computation tools. Students will learn to create their own R packages or Python libraries.
Assessment: coursework (100%) |
STAT8017 Data mining techniques (6 credits)
With the rapid developments in computer and data storage technologies, the fundamental paradigms of classical data analysis are ripe for change. Data mining techniques aim at helping people to work smarter by revealing underlying structure and relationships in large amounts of data. This course takes a practical approach to introduce the new generation of data mining techniques and show how to use them to make better decisions. Topics include data preparation, feature selection, association rules, decision trees, bagging, random forests and gradient boosting, cluster analysis, neural networks, and an introduction to text mining.
Pre-requisites: Students should not be taking or have taken ICOM6044 Data science for business or equivalent.
Assessment: coursework (100%) |
STAT8019 Marketing analytics (6 credits)
This course aims to introduce various statistical models and methodology used in marketing research. Special emphasis will be put on marketing analytics and statistical techniques for marketing decision making including market segmentation, market response models, consumer preference analysis and conjoint analysis. Contents include market response models, statistical methods for segmentation, targeting and positioning, statistical methods for new product design.
Assessment: coursework (40%) and examination (60%) |
STAT8306 Statistical methods for network data (3 credits)
The idea of six degrees of separation theorizes that human interactions could be easily represented in the form of a network. Examples of networks include router networks, the World Wide Web, social networks (e.g. Facebook or Twitter), genetic interaction networks and various collaboration networks (e.g. movie actor collaboration networks and scientific paper collaboration networks). Despite the diversity in the nature of sources, the networks exhibit some common properties. For example, both the spread of disease in a population and the spread of rumors in a social network occur in sub-logarithmic time. This course aims at discussing the common properties of real networks and the recent development of statistical network models. Topics may include common network measures, community detection in graphs, preferential attachment random network models, exponential random graph models, models based on random point processes and the hidden network discovery on a set of dependent random variables.
Assessment: coursework (50%) and examination (50%) |
STAT8307 Natural language processing and text analytics (3 credits)
Textual data constitutes an enormous proportion of unstructured data, which is characterized as one of the ‘V’s in Big Data. Logical and computational reasoning is applied to transform large collections of written resources into structured data for use in further analysis, visualization, integration with structured data in a database or warehouse, and further refinement using machine learning systems. This course introduces the methodology of text analytics. Topics include natural language processing, word representation, text categorization and clustering, topic modelling and sentiment analysis. Students are required to possess a basic understanding of the Python language.
Assessment: coursework (100%) |
Capstone Requirement |
DASC7600 Data science project (12 credits)
Candidates will be required to carry out independent work on a major project under the supervision of an individual staff member. A written report is required.
Assessment: written report (75%) and oral presentation (25%) |
Programme Director
Professor G S Yin
MA Temple; MSc, PhD N Carolina
Patrick S C Poon Professor in Statistics and Actuarial Science
Department of Statistics and Actuarial Science
- mdasc@hku.hk
- 3917 4152
Graduate/ Student Sharing

Ting Hin CHEUNG
Part-time student
"Being a part-time MDASC student with no statistic nor programming background, it was definitely a challenging yet fruitful experience so far. After over a decade working in the sales and trading industry, I felt refreshing back to the campus and studying all the fancy formula and symbols again.
I would say the courses are much more demanding than I expected, but thanks to the summer preparation classes, it helped to recall my memories on some basic concepts."
Enquiries
Ms Aka Lee
Department of Statistics and Actuarial Science
Faculty of Science
The University of Hong Kong