Untangling the Web A PODCAST OF THE WEB SCIENCE TRUST
Deborah McGuinness: So I’m kind of famous for this wine and foods ontology that I literally did in my very early days, when I was taking a graduate class. You know, I had to write an expert system that would make a recommendation. And so I said, “Okay, well, what am I passionate about?” Well, I happen to be passionate about wine and food.
Noshir Contractor: My guest today is Deborah McGuinness, who you just heard talking about creating ontologies for computers. These ontologies can help us pair the perfect glass of wine with our steak – or develop personal health management plans. Deborah is the Tetherless World Senior Constellation Chair and Professor of Computer, Cognitive, and Web Sciences at Rensselaer Polytechnic Institute, or RPI, in the United States. She is also the founding director of the Web Science Research Center and a fellow of the American Association for the Advancement of Science, and she is the recipient of the Robert Engelmore Award from the Association for the Advancement of Artificial Intelligence. Welcome, Deb.
Deborah McGuinness: Thanks for that wonderful introduction. It’s wonderful to be here.
Noshir Contractor: So I have to say that the title of your chair intrigues me. Tell us more about the Tetherless World Constellation.
Deborah McGuinness: Well, a constellation is a feature that our university president put together. Usually, universities have one professor in one area and then don’t have overlapping professors. But her idea was to bring together constellations or groups of stars to make significant contributions in carefully chosen areas. So the original plan was for this contribution to be in, kind of, mobile computing and the future of the web. And then we modified that some to be really, less about mobility, and more about the future of how we work with tetherless communications, as well as tethered communications. I typically refer to that as the future of the web.
Noshir Contractor: That’s a fascinating vision.
Deborah McGuinness: Yes. One of the reasons I left my position directing the Knowledge Systems Lab at Stanford University was because of the interdisciplinary nature and strengths of RPI. I find that my most fascinating work is at the intersection of communities. And actually, that kind of is a perfect tie in to web science, because I don’t think I know of any discipline that’s more interdisciplinary than web science.
Noshir Contractor: That’s absolutely where I wanted to go next, given your interest and skills at being able to navigate interdisciplinary work. You’ve been one of the pioneers in this area, so take us back a little bit to how you got started in the area of web science.
Deborah McGuinness: Well, you know, I’ve been working in knowledge representation and reasoning and the languages and environments to model and reason with knowledge for my entire career. So in the early days, that was languages literally for the Semantic Web, but it was before we called it the Semantic Web. So it was languages that let you get to the implicit information from the explicit statements and were computationally amenable to working with computers. Then, when I went to Stanford, we did a lot of really big, often government-sponsored projects, to do ontology-enabled, or encodings, of meaning. So we did large applications that understood what terms meant, because we encoded those meanings in ontologies. And so, you know, for my entire career, I was making these languages and I was making these environments that were making kind of smart recommender systems or smart data portals. And then when I went to RPI, we kind of took that to another level and made it even bigger. And so when web science was emerging, they needed people who had languages that could not just encode how you’re going to write something on a page, or how you’re going to link one page to another, but actually, what those terms in the page mean. And then also, as I mentioned earlier, I’m really just fascinated by interdisciplinary work. And this just seemed to be a complete and total perfect match for that.
Noshir Contractor: So I’m going to take you back and try to help unpack some terms that you use in the context of web science, for somebody who may not be familiar. You use the words knowledge representation, language, ontology. By language, you mean computer languages, I guess?
Deborah McGuinness: Yes, I typically mean languages for computers. We might focus more on markup languages, so languages that help you annotate terms that you’re going to see in a description of something. And that has initially been, well, “I’m going to write this in red in a particular font.” You know, I’m kind of famous for this wine and foods ontology that I literally did in my very early days when I was taking a graduate class. You know, I had to write an expert system that would make a recommendation. And so I said, “Okay, well, what am I passionate about?” Well, I happen to be passionate about wine and food. Later, we called it The Semantic Sommelier. Someone would say “I’m having steak for dinner.” So we also had some rules in the background that said, with a meat dish without a spicy sauce, we might have a red, full-bodied, dry wine. Once we’ve got that markup, and I’ve got, say, Forman Cabernet Sauvignon in my database, we can retrieve not just that particular wine but also the description of the wine. So let’s say I’m in a restaurant and they don’t have that wine. I can say to the sommelier, “Well, do you have any other red, full-bodied, dry wines?” And then they could list off the ones that match that description.
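The pairing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Semantic Sommelier itself: the class names, the `recommend_description` and `wines_matching` helpers, and the two placeholder cellar entries are hypothetical. Only the rule (meat without a spicy sauce suggests a red, full-bodied, dry wine) and the Forman Cabernet Sauvignon example come from the conversation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WineDescription:
    color: str
    body: str
    sweetness: str


@dataclass(frozen=True)
class Wine:
    name: str
    description: WineDescription


def recommend_description(course: str, spicy_sauce: bool) -> Optional[WineDescription]:
    """Background rule from the interview: meat without a spicy sauce
    suggests a red, full-bodied, dry wine."""
    if course == "meat" and not spicy_sauce:
        return WineDescription(color="red", body="full", sweetness="dry")
    return None  # no rule is encoded for other dishes in this sketch


# Only the Forman entry comes from the interview; the others are placeholders.
CELLAR = [
    Wine("Forman Cabernet Sauvignon", WineDescription("red", "full", "dry")),
    Wine("Placeholder White", WineDescription("white", "medium", "dry")),
    Wine("Placeholder Red", WineDescription("red", "full", "dry")),
]


def wines_matching(description: WineDescription, cellar: List[Wine]) -> List[str]:
    """Retrieve every wine whose encoded description matches, not one named wine."""
    return [wine.name for wine in cellar if wine.description == description]


wanted = recommend_description("meat", spicy_sauce=False)
matches = wines_matching(wanted, CELLAR) if wanted else []
print(matches)
```

Because the match is on the description rather than on the name, a restaurant that lacks the Forman wine can still offer the placeholder red, which is the "any other red, full-bodied, dry wines" question from the conversation.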
Noshir Contractor: And so in the context of a recommender system for a search, you would say, “Show me wines that have a certain quality.” And if the information about the wine is encoded in a markup language that includes those characteristics, then rather than just search for the word Sauvignon Blanc, you will now be able to get a Sauvignon Blanc recommendation based on certain attributes of the wine that have been encoded into the website. Can you amplify it in a more accessible manner than I just fumbled through that?
Deborah McGuinness: Oh, well, actually, I thought you did a pretty good job. So I created this wines ontology and this foods ontology. I made it public. And I was also very active in the description logic community. And so at the time, anybody who was doing work in description logic always looked around for a way to test their work. So almost everybody who did a thesis in the 80s and the 90s – and I think they’re still doing it – tests on some version of the wines and foods ontology. And then later, when I was very active in the World Wide Web Consortium’s standardization effort to make recommendations for languages for the web for encoding meaning, we also wrote a guide to how to use the language, and we used a version of my wines and foods ontology.
Noshir Contractor: That’s a great story. I recall you at web science summer schools and web science workshops for the Web Science Conference talking about these issues and getting excited about it. You mentioned that one of the things that the World Wide Web Consortium has tried to focus on is creating these standards. And the example you gave early on in terms of markup language for things like, you know, whether you want something in red or whether you want a particular kind of font – those kinds of markup languages are extremely standardized, I would argue, around the world. How do you assess the extent to which ontologies have been standardized and embraced and adopted on the web?
Deborah McGuinness: You know, that’s a really interesting question. To get a very detailed, precise description that you can really make critical decisions based on, like how you should treat somebody in a healthcare setting, for example, you really better have somebody who understands the domain – so in this case medicine – very well, and understands what the language that you’re going to encode that meaning in is capable of doing, and then further understands what the reasoning systems that are going to use that encoding can do with it. And that’s a couple of skills that not everybody has put together. So the ontologies see great success when people really understand what they can do. And then they start to see some disillusionment, when people understand how hard it is to get them really well done and very precise. So they’ve taken off, and they’ve kind of gone through the Gartner trough of disillusionment maybe a couple of times. And the reason, I believe, they’re on the upswing, again, is because as, you know, the world knows, machine learning has exploded, and the datasets are getting larger and more accessible. The machine learning community and the extraction community and the embedding community are starting to realize that if they get a little bit of semantics, they can start to tell their algorithms how to use the meaning and get even better results.
Noshir Contractor: So most people have heard of things like tagging on the web, and in a sense, tagging is a form of ontology, but it’s a crowdsourced form of ontology, so it doesn’t have some of the rigor that you’re talking about.
Deborah McGuinness: Yes, so you can see efforts like ConceptNet. You know, in the very early days, MIT just said, “put a bunch of sentences together.” And so those sentences have words in them. And you didn’t have the connections between them, and you didn’t give people information about how to do it. But if you have even small synonyms, like automobile, or auto and car, we might call them synonymous, then you can make that link. You can start with just simple bits or small amounts of semantics from just making relationships between synonymous terms. But then you can also start to make more sophisticated relationships, like wines might have a color associated with them, and they might be made from a particular type of grape. And then, over time, when we’re trying to make more sophisticated recommendations, such as, say health advisors, you might start to have information about when your blood work is out of a range, say for a glucose measurement, which is related to diabetes. You might want to target an intervention with a drug that can help to get your blood work back into the right range.
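The three levels of semantics mentioned above (synonym links, typed relationships, and rules over measurements) can be sketched as follows. All names here are hypothetical, the wine and grape entries are placeholders, and the numeric bounds in the usage line are illustrative values passed in by the caller, not clinical guidance.

```python
from typing import Dict, Optional, Set, Tuple

# Level 1: synonym links map surface terms to one shared concept.
SYNONYMS: Dict[str, str] = {"automobile": "car", "auto": "car", "car": "car"}


def same_concept(a: str, b: str) -> bool:
    """Two terms match if they resolve to the same canonical concept."""
    return SYNONYMS.get(a.lower(), a.lower()) == SYNONYMS.get(b.lower(), b.lower())


# Level 2: typed relationships stored as (subject, property, value) triples.
TRIPLES: Set[Tuple[str, str, str]] = {
    ("ExampleWine", "hasColor", "red"),
    ("ExampleWine", "madeFromGrape", "ExampleGrape"),
}


def value_of(subject: str, prop: str) -> Optional[str]:
    """Look up the value of one property of one subject, if recorded."""
    for s, p, v in TRIPLES:
        if s == subject and p == prop:
            return v
    return None


# Level 3: a rule over a measurement. The caller supplies the reference range.
def flag_out_of_range(value: float, low: float, high: float) -> Optional[str]:
    """Return a flag when a measurement falls outside [low, high]."""
    if value < low:
        return "below_range"
    if value > high:
        return "above_range"
    return None


print(same_concept("Auto", "automobile"))
print(value_of("ExampleWine", "hasColor"))
print(flag_out_of_range(130.0, low=70.0, high=100.0))  # placeholder bounds
```

A flag such as `above_range` is what a health advisor could then connect to candidate interventions; that final step needs vetted medical knowledge and is deliberately left out of this sketch.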
Noshir Contractor: You have actually been working for a long time, specifically, applying semantic web concepts in the area of health. Tell us a little bit about where things started in that area and where you think there is potential for further advancement in terms of health web science.
Deborah McGuinness: That’s really, I think, an up-and-coming area. One of my large projects right now is from the National Institute of Environmental Health Science. And it’s to create a data portal. And I’m in charge of the data science piece of that, where basically, I need to come up with the ontology or the terminology that allows us to integrate data. And in this case, it’s about exposure, like whether your mother might have been exposed to heavy metals at a time during your development, where that was not good for you. So it captures information about exposure and health outcomes. So that in itself, I think is critically important, because we can collect data, we can integrate it in the way that you could pool the data together and do studies on larger numbers of people, which might let you have more confidence in the outcome of any statistical correlation that you’re seeing. You’re only going to be able to do that integration and harmonization if you understand what terms mean. You typically get some data that comes with a data dictionary. So when I see education or ED1, that means that the mother went to junior high school as her highest level of education, which allows us to figure out whether I’ve got studies whose highest education level was college or beyond. And that lets me pull data together and look at more studies that might be compatible to put together. So that’s kind of step one. But then, the next step, the one that I think I’m even more excited about, is personalized health, and, you know, precision medicine, and where I can enable people to help themselves. I want to help everybody in some patient’s ecosystem. So I want to help the person to make wiser choices when they’re not going to their doctor. And I want to help the medical professional make suggestions that are aware of a person’s individual status. 
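The data-dictionary step described above can be sketched like this. It is a minimal illustration, assuming two hypothetical studies: only the meaning of `ED1` (junior high school as the mother's highest level of education) comes from the conversation, while the other codes, the study names, the shared vocabulary terms, and the records are invented for the example.

```python
from typing import Dict, List

# Per-study data dictionaries: local code -> shared vocabulary term.
# ED1's meaning is from the interview; every other entry is hypothetical.
DATA_DICTIONARIES: Dict[str, Dict[str, str]] = {
    "study_a": {"ED1": "junior_high_school", "ED3": "college_or_beyond"},
    "study_b": {"JHS": "junior_high_school", "COLL": "college_or_beyond"},
}


def harmonize(study: str, records: List[dict]) -> List[dict]:
    """Rewrite each record's local education code into the shared term."""
    dictionary = DATA_DICTIONARIES[study]
    return [
        {
            "study": study,
            "id": record["id"],
            "maternal_education": dictionary[record["education"]],
        }
        for record in records
    ]


# Pool records from both studies once they share one vocabulary.
pooled = harmonize(
    "study_a", [{"id": 1, "education": "ED1"}, {"id": 2, "education": "ED3"}]
) + harmonize("study_b", [{"id": 7, "education": "COLL"}])

college = [r for r in pooled if r["maternal_education"] == "college_or_beyond"]
print([(r["study"], r["id"]) for r in college])
```

Once the codes resolve to shared terms, a single filter spans both studies, which is what makes the larger pooled analysis possible.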
So if I’ve got some blood work and one of those numbers is out of range, we can see whether there’s some intervention that might be amenable to me, that I might be able to make a small lifestyle change before I start to make a medication change. But I think a future is something like a personal health knowledge graph. So a graph has nodes and arcs.
Noshir Contractor: What would be an example of that?
Deborah McGuinness: So I might have a personal knowledge graph about Deborah. So Deborah’s a node. And then she’s probably got a lot of arcs coming out. One of the arcs might be demographic information. We might have information about my age. We might have information about my location. And so all of those are going to be arcs that are going to have values. But then you might also feed in the information from the monitor that I wear on my wrist that captures my motion, my steps, and actually also my sleep score. And you might have information from my smart scale, for what I weighed this morning. And then you might actually also be able to track that over time.
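A personal knowledge graph of this kind, a node with labelled arcs whose values can be tracked over time, can be sketched as below. The `PersonalKnowledgeGraph` class, the arc labels, and every value are hypothetical placeholders; a generic `person` node stands in for the "Deborah" node in the conversation.

```python
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

Arc = Tuple[str, Any, Optional[date]]  # (label, value, day observed)


class PersonalKnowledgeGraph:
    """A node with labelled arcs; dated arcs can be tracked over time."""

    def __init__(self) -> None:
        self._arcs: Dict[str, List[Arc]] = {}

    def add(self, node: str, label: str, value: Any, day: Optional[date] = None) -> None:
        self._arcs.setdefault(node, []).append((label, value, day))

    def latest(self, node: str, label: str) -> Any:
        """Most recently added value for an arc label, or None if absent."""
        values = [v for l, v, _ in self._arcs.get(node, []) if l == label]
        return values[-1] if values else None

    def history(self, node: str, label: str) -> List[Tuple[date, Any]]:
        """Dated values for an arc label, ordered by day."""
        dated = [
            (d, v) for l, v, d in self._arcs.get(node, []) if l == label and d is not None
        ]
        return sorted(dated, key=lambda pair: pair[0])


graph = PersonalKnowledgeGraph()
graph.add("person", "location", "ExampleCity")                # demographic arc
graph.add("person", "steps", 8200, date(2021, 3, 1))          # wrist monitor
graph.add("person", "weight_kg", 70.9, date(2021, 3, 1))      # smart scale
graph.add("person", "weight_kg", 70.4, date(2021, 3, 2))

print(graph.latest("person", "location"))
print(graph.history("person", "weight_kg"))
```

Keeping the observation day on each arc is one simple way to support the "track that over time" idea; a production system would more likely use a standard graph model such as RDF.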
Noshir Contractor: Okay, so we’ve got the knowledge graph, I have an idea. How does this then translate into helping you leverage this personal health knowledge graph that you just described?
Deborah McGuinness: Yeah, so I want my personal health knowledge graph to appear to be locally available, you know, through probably my smartphone. And maybe I’m allowing it to give alerts. So I could actually also let it give me alerts, when I’m near a healthy venue, when it’s close to the time that I might eat when I’m away from home. I’m not aware of anything that does that today. But there’s probably some startup doing it somewhere.
Noshir Contractor: And so one of the ways it knows whether a place is healthy or not, is because they are using an ontology system, where they label themselves as “I am healthy.” Is that how this would work?
Deborah McGuinness: Another way of doing it is labeling the site with your menu items. A lot of sites these days have some kind of nutritional information about the things that they’re serving. So you could have a query that says, “Does this restaurant expose that it’s got items for sale that fit particular characteristics?” So let’s say under a certain number of grams of carbohydrates, maybe that have ethnic aspects, you know, maybe I want Indian food that meets those characteristics or something. So it’s not just that the restaurant says, “I put a label of ‘healthy’ on my restaurant,” but they expose information that lets the smart query ask the right kind of questions that are personalized to me.
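The query described above, asking what a restaurant exposes about its menu rather than trusting a "healthy" label, can be sketched as follows. The restaurants, dishes, carbohydrate figures, and the `matching_items` helper are all hypothetical; real sites would publish this through structured markup rather than a Python dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MenuItem:
    name: str
    cuisine: str
    carbs_g: float  # grams of carbohydrate per serving


# Hypothetical menus that restaurants might expose as structured data.
MENUS: Dict[str, List[MenuItem]] = {
    "Restaurant A": [MenuItem("Dish 1", "indian", 18.0), MenuItem("Dish 2", "indian", 65.0)],
    "Restaurant B": [MenuItem("Dish 3", "italian", 12.0)],
    "Restaurant C": [MenuItem("Dish 4", "indian", 80.0)],
}


def matching_items(
    menus: Dict[str, List[MenuItem]], cuisine: str, max_carbs_g: float
) -> Dict[str, List[str]]:
    """Restaurants with at least one item of the cuisine under the carb limit."""
    results: Dict[str, List[str]] = {}
    for restaurant, items in menus.items():
        names = [
            item.name
            for item in items
            if item.cuisine == cuisine and item.carbs_g < max_carbs_g
        ]
        if names:
            results[restaurant] = names
    return results


print(matching_items(MENUS, cuisine="indian", max_carbs_g=30.0))
```

The personalization lives in the query parameters: the same exposed menu data answers a different question for someone with a different carbohydrate limit or cuisine preference.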
Noshir Contractor: You mentioned a few minutes ago that you haven’t seen many of these applications out there. Why do you think we haven’t seen it? And why do you think now is the time that perhaps a startup is working in this area?
Deborah McGuinness: That’s a very good question. I think we’re poised to do that now, maybe better than, say, 10 years ago. There’s way more open data all over the world. I think it’s more common today that restaurants have this kind of information. And there’s also potentially more awareness. There’s more awareness that being overweight or metabolically unhealthy is a tremendous risk. It’s a tremendous risk for a lot of diseases. But it’s definitely a risk for COVID.
Noshir Contractor: So you’ve given us a lot of food and drink for thought in talking about how web science is contributing to our health. You co-authored a book on this back in 2014. And as you look at that now, seven years moving forward, where do you see us going in the next few years in terms of health web science?
Deborah McGuinness: You know, I think we might have been a little bit early on health web science then. I think there was less acceptance in the medical community. I think these days, medical professionals, they have too big of a workload; they don’t have enough time. I think we’re starting to see a lot of apps or services that they’re beginning to trust because those apps or services are vetted, and they’re showing, on an evidence basis, that they’re making good recommendations that the doctors can at least somewhat rely on. You don’t want your app to be taking over what your doctor did for you all the time. But you want that to be helping. And then at the same time, I think more and more we’re seeing the regular Joe looking for tools and applications that can help them lead a healthier, high-quality life. I don’t want to rely on having to go to my medical doctor every week, because, you know, nobody can afford that, time- or money-wise. I want to be able to have tools that help me to do that in my day-to-day life. And you’re also seeing a push from technologists who realize that we’ve got a lot of foundational hardware, a lot of foundational data, what appears to be unlimited compute power. So the time is kind of ripe for these applications to take hold.
Noshir Contractor: Which brings us back full circle to the idea of web science being so interdisciplinary, because this is a classic example, as you’ve described it, of people like yourself who come from a background in computer science and web science having to work very closely with and gain the trust of, in this case, the professionals in the health community, as well as the laypeople, the general public. And until you have those sorts of connections and trust amongst these various stakeholders, you’re not going to see health web science reach critical mass.
Deborah McGuinness: Yes, that’s exactly right.
Noshir Contractor: And I can imagine that many doctors might be initially threatened or suspicious of these technologies, because, for example, there is quite a lot of chatter these days about whether the notion of you going for an annual physical checkup is somewhat antiquated. Why go once a year when you have all these health monitoring devices that are monitoring a lot of your vital statistics 24/7?
Deborah McGuinness: Well, I don’t think any of these tools are going to replace the need for a medically trained professional, I think they’re just going to augment the professional. I don’t think there’s really any replacement for a truly caring, trained medical professional seeing you at least now and then, and certainly helping in a time of crisis.
Noshir Contractor: I’m sure you have reassured many physicians who might be listening in on this podcast. Again, Deb, thank you so much for taking time to give us a lot of insight about how much more we can do in the area of health web science than maybe a few years ago when we were fascinated by websites like WebMD and so on. There’s so much more that we could be doing, and you have certainly been one of the thought leaders and visionaries in this area. And thank you again for taking time to talk with us about some of where you see health web science going.
Deborah McGuinness: And thank you very much for your insightful questions. And I look forward to continuing this discussion on the web and off.