Distinguished Lecture - Veridical Data Science: the practice of responsible data analysis and decision-making

Date & Time: May 21, 2021 (Fri) | 10-11am (HKT)

Venue: ZOOM online lecture (https://bit.ly/3wRQPGJ)

Speaker: Professor Bin YU
Chancellor's Distinguished Professor, Departments of Statistics and EECS, UC Berkeley


"AI is like nuclear energy – both promising and dangerous" – Bill Gates, 2019.

Data Science is a pillar of Artificial Intelligence (AI) and has driven most of the recent cutting-edge discoveries in biomedical research. In practice, Data Science has a life cycle (DSLC) that includes problem formulation, data collection, data cleaning, modelling, result interpretation and the drawing of conclusions. Human judgement calls are ubiquitous at every step of this process, e.g. in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgement calls are often responsible for the "dangers" of AI.

To mitigate these dangers as far as possible, a framework was developed based on three core principles: Predictability, Computability and Stability (PCS). Through a workflow and documentation (in R Markdown or Jupyter Notebook) that allows one to manage the whole DSLC, the PCS framework unifies, streamlines and expands on the best practices of machine learning and statistics – bringing us a step forward towards veridical Data Science. Professor Yu will illustrate the PCS framework in the modelling stage through the development of DeepTune images for characterisation of neurons in the difficult V4 area of the visual cortex.

Speaker Professor Bin YU

Professor Bin YU is Chancellor's Distinguished Professor and Class of 1936 Second Chair in the Departments of Statistics and EECS at UC Berkeley. She leads the Yu Group, which consists of 15-20 students and postdocs from Statistics and EECS.

She was formally trained as a statistician, but her research extends beyond the realm of statistics. Together with her group, she has leveraged new computational developments to solve important scientific problems by combining novel statistical machine learning approaches with the domain expertise of her many collaborators in neuroscience, genomics and precision medicine. She and her team also develop theory to understand random forests and deep learning, both to gain insight into these methods and to guide their use in practice.

She is a member of the US National Academy of Sciences and of the American Academy of Arts and Sciences. She is a Past President of the Institute of Mathematical Statistics (IMS), a Guggenheim Fellow, Tukey Memorial Lecturer of the Bernoulli Society, Rietz Lecturer of the IMS, and a COPSS E. L. Scott prize winner. She serves on the editorial board of the Proceedings of the National Academy of Sciences (PNAS) and on the scientific advisory committee of the UK Turing Institute for Data Science and AI.
