Professor Tracy Ke is Interviewed for the International Day of Women in Statistics and Data Science

March 7, 2023
The Statistics Department is featuring four statisticians over several months as part of our series celebrating the inaugural International Day of Women in Statistics and Data Science (October 11th, 2022), as well as International Women's Day on March 8th, 2023. Introduced by the Caucus for Women in Statistics, the Portuguese Statistical Society, and the American Statistical Association, the day celebrates the research contributions of women statisticians and data scientists around the world.  

While our last interview featured Assistant Professor Morgane Austern, the following interview, excerpted and revised, is with Assistant Professor Tracy Ke.  During the interview, Professor Ke highlights her early discovery of the beauty, elegance, and power of statistical thinking and her hope that the increase in women faculty role models will encourage more young women to go into the field.  Professor Ke also shares the exciting results of a recent project that uses statistical methods on text in journals to study trends in different disciplines.

1)  Please introduce yourself (your position at Harvard and a summary of your current areas of research).

Ke: My research is mainly on high-dimensional statistics [high-dimensional data occur when the number of features in a dataset exceeds the number of observations], machine learning, social network analysis, and statistical text mining, which involves developing statistical models and methods for extracting information of interest from text.

2)  What inspired you to go into this field?  Were there experiences in your education and early career as a researcher that stood out?

Ke: When I was majoring in mathematics and physics at Tsinghua University, I took a course called Physical Experiments that sparked my interest in data and statistics. In the course, we performed one experiment per week to test physical laws in thermodynamics, electromagnetism, optics, acoustics, etc., and then wrote up our analysis in a report. The instructor spent a whole class talking about how to use the least squares method to find the best-fit line, the one that minimizes the sum of squared distances between the fitted and actual data points. When teaching this method, the instructor reminded us to be careful in making inferences about the data and to check for issues that might skew our results, such as small sample size or collinearity [when one variable in a regression model is highly correlated with another]. I also learned that when analyzing experimental results, it’s crucial to report the p-value, and, if the p-value is large, to avoid rushing to conclusions. This was the first time I realized the value of statistics; I understood that the method of analysis would significantly affect conclusions about the data (and the trustworthiness of these conclusions!).

As a graduate student at Princeton, I began to understand why statistical methods, such as sure independence screening, are so useful and why the ideas behind the methods are so elegant and beautiful. Dr. Jianqing Fan [Professor Ke’s dissertation advisor] introduced me to this foundational concept so that I could use it in my research. One time, when I was sitting on a bus with a friend who came from a non-science background, I explained this concept and its significance to him in plain language. I said, “Imagine that you want to fit a model, but the dataset includes a huge number of explanatory variables and a limited sample size, which seems to make this impossible. However, if you have prior knowledge that many of the variables are not useful, and if you have a way of screening out those variables, then you can improve the model fitting significantly.” With this explanation, I convinced my friend that these statistical methods were important. When I started to see others’ positive perspectives on my work, particularly the perspectives of scholars outside of my field, my confidence and enthusiasm for statistics grew.

3)  Describe a recent project that has been exciting to work on. Why is this an important project?

Ke: In a recent project with collaborators, I developed a template for using statistical methods to analyze bibliometric data, such as the text abstracts and citations in journals. Our basic question was: How can we use statistical methods to analyze bibliometric data for developments in a scientific field? As part of the project, we collected, cleaned, and made available to the public the bibliometric data of over eighty-three thousand papers published in statistics-related journals. By using natural language processing [using computational techniques to enable computers to process language], we retrieved a lot of information from abstracts in journals and from networks of researchers, such as co-authorship and citation networks. In the process, we identified the popular topics or sub-fields of statistics and determined the community structures among statisticians. This project was an important milestone because it was the first time that I both created the dataset and generated the research questions (in statistics, it’s quite common to answer the research questions posed by scientists in other disciplines). I also found it incredibly rewarding to see how my previous work on statistical theory and methods could successfully be applied to real-world data.

4)  What steps do you think are effective for recruiting and/or retaining women in your field?

Ke: To recruit more women students into statistics and STEM-related PhD programs and to encourage them to consider an academic career, it’s important to share the success of women in our field. For example, in our department, we have a group of very talented women researchers who are role models for our undergraduate and graduate students who are in the process of deciding their future careers. Role models are powerful because they can counter the negative bias that young women might encounter from their family, peers, or others and can persuade young women not to give up on a career in research.

5) What advice would you give to young women (in grade school, college, and beyond) who are interested in pursuing research in your field?

Ke: My suggestion for incoming women researchers is to not be afraid of failing or of seeking help. Throughout my research career, I have had many unsuccessful attempts: for example, a research project that didn’t go well, or a paper that didn’t receive positive feedback from reviewers. This kind of disappointment can undermine people’s confidence in their academic career potential and can discourage them from pursuing the research projects that interest them. To learn how to deal with these frustrations without getting discouraged, it’s important for young women researchers to seek advice from senior faculty.

Stats:  Students and researchers starting out in their careers can learn a lot from Professor Ke’s advice to seek help and to embrace failure as a learning opportunity.  We appreciate Professor Ke taking the time to share with us what inspired her career path and how she hopes to inspire the next generation.

Please stay tuned for the last article in our series on PhD student Kelly Zhang, co-advised by Professor Susan Murphy and Assistant Professor Janson.