Interview with Virginia Ma about her 2023 Concurrent Masters Prize

February 1, 2024
Ginnie Ma and Jason Zhou receive awards

In May 2023, Virginia (Ginnie) Linqian Ma won the inaugural Department of Statistics Concurrent Masters Prize (along with Jason Zhou ’23 AB/AM).  This prize is awarded annually to the graduating master’s (AM) student who has the best overall performance in coursework, has demonstrated achievements in Statistics outside of coursework, and has contributed significantly to the department.  In the following interview (excerpted and edited) from the summer (pardon the publishing delay!), Ginnie speaks with us about her journey as a master’s student through the Stats Department.  Read more to learn about Ginnie’s favorite courses, thesis project, and inspiration for starting GUSH and the Florence Nightingale Day (a day of statistics outreach for middle and high school students).

1. When did you first become interested in statistics?

Ma:  In high school, I was very lucky that we had the opportunity to take a Stats class – not your typical AP stats class – with a professor from a local university whose daughter went to our school.  The course incorporated calculus and was similar in flavor to Statistics 110 [Probability] and 111 [Introduction to Statistical Inference].  I knew I already enjoyed taking math classes, but statistics interested me because you could use it to make sense of real-world phenomena.  The second experience that I had in high school (in my junior year) was a summer research project working with gene expression data from pancreatic cancer patients.  Working on the project made me realize that I was mostly interested in what was going on under the hood; I wanted to know why these methods worked and the principles behind them.

From these past experiences, I knew I valued a more theoretical perspective and was interested in the interplay between my pure math and statistics courses in college.  As a math concentrator, I learned how to think critically; writing a proof is different from writing anything else.  After taking a lot of pure math classes, I could synthesize the proof techniques and apply them to my work in statistics, particularly in proof-based stats classes.  As I progressed in math and statistics, I became more interested in pursuing 200-level stats courses and the master’s program in statistics.  We're very lucky that the department gives us this unique opportunity to receive a concurrent master’s degree.

2. What were a few highlights as a master's student in the program? Are there certain mentors, courses or parts of the program that really shaped your experience and interest in the field?

Ma:  This is probably a frequent answer, but I think the best way to introduce someone to Stats Department courses is by taking Stat 110 and Stat 111.  The year I took Stat 111, it was co-taught by Prof. Joe Blitzstein and Prof. Neil Shephard, which produced an amazing classroom dynamic.  I took several classes with Joe, and his dedication to making sure that all students enjoy and engage with the material was essential for making the Stats Department a welcoming place for me – a place where I could see myself tackling difficult problems.  Also, Joe introduced me to the beauty and elegance of statistics in Stat 210 [Probability I].  For example, the course taught us the technique of using representations to solve for a distribution function.  Typically, to relate the distributions of two random variables, you need to use a tedious amount of algebra or calculus.  Using a representation simplifies the solution and makes it more concise (instead of writing out pages of work!), which was incredibly eye-opening and satisfying for me.

Another class that I really enjoyed was Stat 171 [Introduction to Stochastic Processes] with Professor Subhabrata Sen. Stochastic processes are probabilistic models for random quantities that typically evolve over time or space. The course was a great example of a class that balances theory and coding; when learning about stochastic processes, such as Markov Chain Monte Carlo, we needed to use both theoretical and computational skills to solve the problems.  
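[Editor's illustration, not from the course: a minimal sketch of the theory-plus-coding balance Ginnie describes, using a two-state Markov chain, one of the simplest stochastic processes. The weather states, the transition probabilities, and the function name simulate_chain are all hypothetical. The theoretical side says this chain's long-run fraction of "sunny" steps should be 0.5 / (0.1 + 0.5) = 5/6; the computational side checks that by simulation.]

```python
import random

def simulate_chain(transition, start, steps, seed=0):
    """Simulate a discrete-time Markov chain and return the visited states."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    state = start
    path = [state]
    for _ in range(steps):
        next_states = list(transition[state].keys())
        weights = list(transition[state].values())
        # The next state depends only on the current state (Markov property).
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path

# Hypothetical transition probabilities, chosen only for illustration.
transition = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

path = simulate_chain(transition, "sunny", 10_000)
frac_sunny = path.count("sunny") / len(path)
print(f"simulated fraction of sunny steps: {frac_sunny:.3f}")
print(f"stationary probability of sunny:   {5 / 6:.3f}")
```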

3. How did you select your thesis project? What questions were you asking?

Ma:  I selected Prof. Lucas Janson for my math thesis advisor because I was interested in research in statistical inference (I also enjoyed his Stat 211 course on this topic) and because I had heard he was a great advisor from other students.  I reached out to him to discuss a potential research project; it seemed like a good fit, so I went for it!  Throughout the process, Lucas was an excellent mentor because he struck a nice balance between giving me independence to explore on my own and providing helpful guidance to make sure that I didn’t stray too far off track.

My math thesis was titled “Uncertainty Quantification with Empirical Bayes.”  Uncertainty quantification is useful for capturing the degree of uncertainty associated with models, estimates, and other quantities of interest.  Let’s say that you want to estimate the average height of a population.  Because you can’t perform a census for every single person, you take a sample of people’s heights and average them.  However, your sample might be quite small or not very representative of the general population, so it can be beneficial to provide a range instead of a point estimate (e.g., estimating the average height is between 5’2’’ and 5’4’’ instead of estimating 5’3’’).  High-dimensional data presents the challenge of identifying which covariates are relevant to the outcome variable.  After identifying relevant covariates, you can perform inference such as hypothesis testing or uncertainty quantification.  One appealing approach toeing the line between the two primary frameworks of statistical inference (frequentist and Bayesian) is Empirical Bayes, which shares properties of Bayesian methods, such as allowing for the use of data multiple times in constructing a model and performing analysis, but has weaker prior assumptions.  My thesis examined whether Empirical Bayes methods could help with the accuracy of uncertainty quantification in these high-dimensional settings.  The goal was to use mathematical and statistical tools to analytically evaluate these methods and then see if they produced better interval estimates.
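[Editor's illustration: a minimal sketch of the height example above, contrasting a point estimate with an interval estimate. It is not the Empirical Bayes method from the thesis. The sample values are made up, the function name mean_interval is hypothetical, and the interval uses the textbook normal approximation (mean plus or minus 1.96 standard errors); for a sample this small, a t-distribution quantile would usually be preferred.]

```python
import math
import statistics

def mean_interval(sample, z=1.96):
    """Return (lower, mean, upper): an approximate 95% interval for the mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    return mean - z * se, mean, mean + z * se

# Hypothetical heights in inches, invented for this example.
heights_in = [62.0, 63.5, 61.0, 64.0, 63.0, 62.5, 64.5, 63.0]

lower, mean, upper = mean_interval(heights_in)
print(f"point estimate: {mean:.1f} in")
print(f"interval estimate: ({lower:.1f}, {upper:.1f}) in")
```

A wider interval signals more uncertainty; collecting a larger sample shrinks the standard error and narrows the range.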

4. What are a few things that you will miss the most about your Harvard experience?

Ma: The Stats Department was a special community for me in many ways. One aspect that I appreciated was that the PhD students were incredibly friendly and kind, which was something I wasn’t expecting because they are older students with more established paths in statistics. Once I started taking higher level classes, the PhD students were amazing classmates, too, and I learned a lot from them.

Another part of the Harvard community that I miss is GUSH [Group for Undergraduates in Statistics at Harvard], which was a big part of my college experience.  As you know, Prof. Blitzstein helped Rachel [2023 stats concentrator alum Rachel Li] and me to get GUSH off the ground.  The first official event in May 2020 was an alumni panel on Zoom, which had a great mix of speakers and strong attendance.  This was an exciting event to launch the group because we saw all these students come together after being sent home at the beginning of the pandemic.  GUSH was a way to stay in touch with people and remain integrated with the community.  A second favorite GUSH event would be the women in statistics panels that we’ve hosted for two years.  Held in conjunction with Women’s Week (hosted by the Harvard Women’s Center), these career panels present a great opportunity for students to hear from panelists at different stages in their career about varied career paths and challenges faced by women statisticians.

Another event that I was proud of working on was the Florence Nightingale Day in the Stats Department in October 2022.  [Introduced by the American Statistical Association and the Caucus for Women in Statistics (CWS) in 2018, FND is part of an international effort to celebrate women in statistics, biostatistics, and data science and to cultivate student engagement in these fields.  In fall 2023, we hosted a similar event for high school students from Boston and Cambridge public schools called Data Adventure Day].  I had participated in a Florence Nightingale Day [FND] when I was in high school and was interested in initiating one at Harvard, but it wasn’t until I reached out to Prof. Kelly McConville that the initiative launched.  Kelly was very invested in educational initiatives, particularly those focused on equity.  When I emailed her about FND, she sent me a Google Doc full of ideas that same weekend!  With help from her and the other committee members, we were able to bring about 50 middle and high school students to campus for a full day of statistics activities.

5. What are you looking forward to? What are your short-term goals?

Ma: This summer, I’m looking forward to relaxing and recharging.  One of the activities that I picked up last summer at Harvard was running along the Charles River.  While I ran track in high school and always thought of myself as a “short distance” person, I’m working on building up my stamina and running longer distances, which I hope to do at home this summer and at school in the fall.  In addition, I look forward to enjoying the outdoors more by going hiking with my family and gardening.  Lastly, I’m going to try to read more this summer.  Joe gave me a book that details advice from statisticians about their tips for graduate school and beyond, so I’m planning on reading the book before I move across the country to start my PhD program in Statistics at Stanford. Actually, the first time that I ever traveled to California was during the visit weekend for incoming students; I’m excited about moving and getting started!