We sat down with Professor Kathy McKeown, director of the Data Science Institute and the Henry and Gertrude Rothschild Professor of Computer Science here at Columbia, to learn about her past and some of her current work in the new Institute.
One thing most students might not know is that you majored in Comparative Literature as an undergraduate. What led you to move from Comparative Literature into Computer Science? How has your undergraduate major helped you in your current work?
When I was an undergraduate, I was torn between literature and math. I proceeded along both paths and chose literature primarily because of a very persuasive faculty member. I only discovered CS towards the end of my time at Columbia and I liked it. When I graduated, I talked to a friend who told me about Computational Linguistics, which was a mixture of the things I loved. I spent time reading about it and knew it was something I wanted to do.
Do you have any plans to further develop Newsblaster? How long did it take you to develop it? What influenced you to make it?
Newsblaster was developed over a five-to-ten-year period by a team that had over ten students working on it at a time. I realized that text summarization was going to be possible and I wanted to be the first to get in there.
I got a grant to do it. We worked primarily on news and it was nice to get daily news updates. It took a while to refine what it looked like, but what I essentially had was a platform for resolving problems with research and collaboration.
I have students working on updating it. We’re working on modernizing it and connecting it to RSS feeds and updating the summarization outcomes. I would like for these updates to come in the fall.
What led you into Data Science? As the Director of the Columbia Institute for Data Sciences and Engineering, what are some of your long-term goals for the institute?
Data Science is closely related to research in Natural Language Processing (NLP) and has been for years. The whole research area of NLP really connects, and we use statistical methods for analyzing it. I am very interested in interdisciplinary work, and data science at its core is interdisciplinary. It’s a combination of machine learning, CS, and statistics. In the context of different disciplines and problems, it really brings people together. I really enjoy taking something on, creating something new, and bringing together a group of people.
We now have two academic certifications intended for working people who want to transition into data science. We also have a master’s program in Data Science. We have started working on a PhD that we plan to offer in fall 2017. We’re working on making an undergraduate major in Data Science so that undergraduates can be involved. There is also the Columbia Data Science Society that works to create opportunities for students to do hackathons, participate in research, and more.
Students can get involved by getting degrees and participating in activities. We want to help students get connected to companies that want to hire them.
The institute is organized into centers focused on certain themes and applications of data science. These themes and applications include new media, health analytics, smart cities, core theory, and cyber security. Faculty can be involved through research, teaching students, and helping plan activities.
If students express interest, we will make an introductory class.
How do you see the field of data science growing over the next few years? Do you think Columbia should offer a Data Science major for undergraduate students?
I think Data Science is exploding. New institutes are coming every week. Columbia is a leader in this because we started early. I see it exploding on campus too; there’s a lot of interest across campus and we’re looking at how to set up interdisciplinary courses between departments. We have been hiring data science faculty, we are continuing to do so, and we have new academic programs coming.
This year we have 90 students in the master’s program, which is fairly high for a new program.
What is one of your greatest challenges as a Professor?
One, having the freedom to choose research to work on and shape. And two, working with students. I have always liked working with students and today that is still my favorite part. I love working with my PhD students. I’ve missed teaching students since I have been working on the Institute.
What advice would you give to current students uncertain about what they want to do after college?
I would say that your undergrad years are your chance to learn about what you want to learn and what you want to do. I know in some ways this is not helpful, but I recommend doing things that interest you rather than the things you think will get you a job. The world changes and the job market does too… it’s always possible to wait until your junior or senior year to begin looking at real world possibilities. You can always add new courses to help you. Everybody changes, from changing disciplines to taking new journeys. I have students who enjoy creative writing and are now into NLP. No one should feel that, when making their choices, their decision is forever. You should do what you love and things will work out.