Academic Talks (Nitesh Chawla)

Title: Learning from Imbalanced Data

Speaker: Nitesh Chawla
Assistant Professor
Computer Science and Engineering Department, University of Notre Dame, USA

Time: Thursday, October 29, 14:30-15:30

Venue: Meeting Room 404, Mong Man Wai Building (蒙民伟楼)

Abstract:
Recent years have brought increased interest in applying data mining techniques to
difficult "real-world" problems, many of which are characterized by imbalanced
learning data, where at least one class is under-represented relative to
others. Examples include (but are not limited to): fraud/intrusion detection,
risk management, medical diagnosis/monitoring, bioinformatics, text
categorization and personalization of information. The problem of imbalanced
data is also often associated with asymmetric costs of misclassifying elements
of different classes. In this talk, I will present our work on finding
problems in, proposing solutions to, and performing analysis on imbalanced
data.

Biography:
Nitesh Chawla is an Assistant Professor in the Department of Computer Science
and Engineering at the University of Notre Dame. He directs the Data Inference
Analysis and Learning Lab (DIAL) and co-directs the Interdisciplinary Center
for Network Science and Applications (iCeNSA) at Notre Dame. His research
is primarily focused on machine learning, data mining, and complex networks.
His work has led to applications in various domains including climate data
sciences, biology, medicine, finance, security, and social science. He is on
the editorial board of IEEE Transactions on Systems, Man and Cybernetics Part
B, and has served and continues to serve on the program and organizing
committees for a number of top-tier conferences. He has received various
awards and honors, including best dissertation and best paper awards, an
outstanding undergraduate teacher award, and the NAE New Faculty Fellowship.
His current research is supported by NSF, DOD, NWICG, NIJ, and industry
sponsors.

Title: A framework for monitoring classifiers' performance: when and why failure
occurs?

Speaker: Nitesh Chawla
Assistant Professor
Computer Science and Engineering Department, University of Notre Dame, USA

Time: Friday, October 30, 14:30-15:30

Venue: Meeting Room 404, Mong Man Wai Building (蒙民伟楼)

Abstract:
Classifier error is the product of model bias and data variance. While it is
important to understand the bias involved when selecting a given learning
algorithm, it is similarly important to understand the variability in data
over time, since
even the One True Model might perform poorly when training and evaluation
samples diverge. Thus, the ability to identify distributional divergence is
critical to pinpointing when fracture points in classifier performance
will occur. Contemporary evaluation methods do not take into account the
impact of distribution shifts on the quality of classifiers’ predictions. In
this talk,
I present a comprehensive framework to proactively detect breakpoints in
classifiers’ predictions and shifts in data distributions through a series of
statistical tests. I outline and utilize three scenarios under which data
changes: sample selection bias, covariate shift, and shifting class priors.
