ChilCast: Healthcare Tech Talks
Getting Real with Healthcare Data
On this episode of the Chilcast, John Moore III, Managing Partner of Chilmark Research, sits with Joshua McHale, a consulting analyst at Chilmark who was the lead author for the recent report, Getting Real with Healthcare Data: RWD/RWE Enabling New Models of Research and Care Delivery. They discuss the importance of RWD/RWE in healthcare delivery and research today, notable findings from the report, and advice for folks starting or getting deeper into their RWD strategy.
Questions? Comments? Feel free to reach out at podcasts@chilmarkresearch.com. We'd love to hear from you.
Audio editing and production handled by Colin Kingfisher.
John: [00:00:14] My name is John Moore and I'll be the host of the podcast. Today I am sitting here with Joshua McHale, who is the lead author on our latest report, Getting Real with Healthcare Data: RWD/RWE Enabling New Models of Research and Care Delivery. Joshua is a consulting analyst with us here at Chilmark. But I'll let him give a brief introduction about what his work has encompassed in the past and why he was a great candidate to help us develop this research.
Josh: [00:00:41] Excellent. Thank you, John. You know, my name is Josh McHale. I've worked across the health care industry for the last couple of decades, working for providers, consulting firms, you know, always having a research data, analytic bent to my work. And I've used data from all different sources, all different types to help businesses run their business better and answer questions. So I've used everything and looked at it all different ways from all different sides.
John: [00:01:09] And you're also a big believer in the opportunity for this type of data to really revolutionize care. I remember the quote that we used in our press release was from a blog post that you had written where it really dove into the fact that in order for this to work, we really got to see more cross collaboration among the different stakeholders in health care to standardize this data. But when you think about some of the opportunities here, what is getting you excited about this? Why is it something that you've dedicated your career to doing at this point?
Josh: [00:01:37] Sure. So when I started out my career, there was very little publicly available data for researchers to use, and it was sort of the beginning of EHRs and claims data being more widely available. And so as my career has progressed, I've seen what can really be done when all of this data is integrated together and sort of used not just as an individual snapshot in time, but, you know, the longitudinal data and following patients over time and creating this patient journey across time and really tracking how individuals do across the health care system.
John: [00:02:15] So in your own personal experience, what have you found to be particularly relevant and valuable data sets that you've worked with?
Josh: [00:02:24] So for me, really the valuable data sets and the ones that are relevant are those that do have the ability to track patients over time and not just claims data, but integrating in that, you know, eligibility data and all of that, that health care experience across the industry. So it's really a combination of data and outcomes of procedures and diagnoses mixed in with that claims data. It's hard to do. Patients go in and out of health systems in and out of insurers. So yes, in and out of networks and yeah, and because of the way we've structured healthcare and claims, it's really hard to see people as they move across time.
John: [00:03:11] Okay. So that's a perfect segue into taking a step back and getting a little bit more into the definition and the history of real world data and evidence. So just for those of us out in the community that may not be as familiar with exactly what we're talking about here, could you just define real world data and real world evidence and what that means today?
Josh: [00:03:32] Sure. So real world data is any data that we generate through the health care system, right? So any time a claim is processed through your health insurer, or any time data is entered into an EMR, you know, or you participate in a clinical trial and someone collects data about you and your health care experience, that's sort of the overarching definition of real world data: your actual experience in the health care system. And then real world evidence is sort of any way you look at that real world data. So any analytic work generating evidence on outcomes, you know, outcomes of procedures or outcomes of clinical trials, any evidence that's generated using a group of real world data is what we're talking about.
John: [00:04:23] Real world evidence doesn't exist without real world data. And real world data is what you need to run the research and develop your your evidence based practices essentially, or evidence based decision making. Okay. So how about a little bit of background? Where did this all begin? I know that, you know, historically it was kind of around the claims data and clinical trials data. So how did that first go from being individual assets that were then used to do research with more broadly to where we are today?
Josh: [00:04:53] Yeah, absolutely. So that is where it started. Those were sort of the two big buckets, where the clinical trials collected a lot of data, as did hospitals with that shift to EMRs. In the beginning, you know, there was very small EMR uptake, and clinical trials sort of led the way in electronic capture of data. And there's limited data that you can look at through clinical trials, and they're expensive to run. So as a way of getting more power to those clinical trials and looking at more individuals, you know, people started to link together that EMR data, especially as, like I mentioned, EMRs started to explode within the last two decades, with most hospitals being on an EMR now, and really marrying those two data sets up. But then you start going beyond that and looking at, you know, patient linking across other data sets, whether they're patient registry data, disease registries or, you know, other extraneous non-health care related data sets. You know, a lot of companies now are looking at how to link up other economic data and, you know, sociodemographic data as well to get a full picture of an individual.
John: [00:06:10] So you're just saying that one of the big things is around data aggregation and trying to actually collect all these records. And that's one of the big things that has been done. But, I mean, obviously there's no patient identifier. We're still dealing with a lot of fractured data and a lot of issues around that. So what has changed in recent years that has tried to reconcile that and also add a bit more robustness to these different data assets out there?
Josh: [00:06:34] Sure. So in the last several years, data science has really looked to solve the problem of patient matching and being able to match a patient across disparate data sets. You know, you've got an EMR from one hospital, an EMR from another hospital, and they may have different patient identifiers. And so there's been an increase in data science work to bring those two together and say, even though we don't know 100% that this patient is equal to this patient, you know, there's a reasonable match that tells us these two patients are the same. Without patient identifiers, you know, without one underlying linking variable across the data sets, you can still say with reasonable certainty that patient A over here is equal to patient A over here, and that they're one and the same, and therefore you get a more complete picture.
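For readers who want a concrete picture of the matching Josh describes, here is a minimal sketch of scoring two records that share no common identifier. It is illustrative only: the field names, weights, and threshold are assumptions made up for this example (none were discussed in the episode), and production patient matching uses far more careful probabilistic methods.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical fields; real EMR extracts will differ.
    first_name: str
    last_name: str
    birth_date: str  # ISO format, e.g. "1970-03-15"
    zip_code: str


# Illustrative weights and threshold, not tuned on real data.
WEIGHTS = {"first_name": 0.2, "last_name": 0.3, "birth_date": 0.4, "zip_code": 0.1}
MATCH_THRESHOLD = 0.85


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(left: PatientRecord, right: PatientRecord) -> float:
    """Weighted sum of per-field similarities (weights add up to 1)."""
    return sum(
        weight * similarity(getattr(left, name), getattr(right, name))
        for name, weight in WEIGHTS.items()
    )


def is_probable_match(left: PatientRecord, right: PatientRecord) -> bool:
    """True when the score clears the (assumed) threshold."""
    return match_score(left, right) >= MATCH_THRESHOLD


# Two hospitals spell the first name differently; a third record is unrelated.
hospital_a = PatientRecord("Jonathan", "Smith", "1970-03-15", "02139")
hospital_b = PatientRecord("Jonathon", "Smith", "1970-03-15", "02139")
unrelated = PatientRecord("Maria", "Lopez", "1985-11-02", "90210")

same_person = is_probable_match(hospital_a, hospital_b)
different_person = is_probable_match(hospital_a, unrelated)
```

The point of the sketch is the shape of the decision: no single field has to agree exactly, and the answer is "reasonably certain" rather than 100% certain, which is the trade-off described above.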
John: [00:07:29] So there's some algorithmic stuff going on. You know, how much of a role are AI and OCR playing in that as well? Because I know that we mentioned NLP, AI and OCR as different kinds of algorithm-based tools that have some intelligence to them and can help us better collect patients, but also better take advantage of what's in unstructured data. So how have some of these advancements in technology helped with both freeing up some data as well as reconciling the records across different individuals?
Josh: [00:07:59] Yeah. So I think AI, generating those algorithms and learning how to match the patients, is sort of starting to become an area of focus for some companies: how can you do that by teaching a computer to do it instead of just doing it by hand? As far as NLP, that's sort of the next area where there's, you know, a lot of room for improvement: taking free text out of EMRs and transitioning it into these larger data sets as structured data fields that we can then analyze. The vast majority of information from an EMR is in an unstructured note field that is really hard to pull usable data out of. So a lot of time and effort is going into how to do that, and there aren't many great ways to do that now. It hasn't been solved yet, right? So the problem of notes and free text is one where there's a lot of room for improvement in how information is taken out of the free text field.
John: [00:08:59] Yes. Yes. So that's one of the big areas right now that's still an opportunity for incumbents to develop something new or for some new player to come into the space and really help shake things up and add some new value. I know that obviously with the LLMs being able to kind of predict the value of what's trying to be said, or tease out what is actually written by predicting what words should come together and what words make sense given somebody's script, obviously that's going to be helpful, but there's still no actual intelligence at the interpretation level of that. So you still need an interpretation layer that goes on top of just that reconciliation and bringing everything together. So hopefully we can see some of that in the next couple of years. I think that, obviously with all this hype around the LLMs and the new AI tools that everybody and their mom is trying to find new use cases for, we should see something that helps with this unstructured data problem. And obviously other initiatives like TEFCA and USCDI that are helping to create new standards for different types of data will also be really helpful, because some of these data sets that should be structured historically haven't actually been standardized in any real way. So hopefully both of those things really help a lot with freeing up some of this longitudinal data and the real world data that we're talking about. So you've mentioned longitudinal records a couple of times, and I've kind of hinted at this, but how close are we actually to achieving that goal of having complete longitudinal records for research purposes?
Josh: [00:10:28] So we are a ways away still. It's one of the hardest things to solve for, and again, it comes down to that, you know, as patients move across the health care system and transition between providers and insurance companies. The more someone goes to one health care system and one insurer, the easier it is to do. But as you have a population that moves more frequently now, it's sort of getting harder and harder to do. And you do have companies out there that may claim to have, you know, hundreds of millions of lives in their data sets, but the actual amount of time they have for each one of those individuals varies greatly: they may have one record for someone, and, you know, they may have decades for another individual. And so it gets really hard to do that longitudinal data over time.
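The gap between "lives in the data set" and usable longitudinal history can be checked with a simple coverage summary. The sketch below is an illustration under assumptions: records are taken to be plain (patient ID, service date) pairs, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def coverage_by_patient(records):
    """Map patient id -> (record count, days between first and last record)."""
    dates = defaultdict(list)
    for patient_id, service_date in records:
        dates[patient_id].append(service_date)
    return {
        pid: (len(ds), (max(ds) - min(ds)).days)
        for pid, ds in dates.items()
    }


def share_with_min_history(coverage, min_days):
    """Fraction of patients whose observed history spans at least min_days."""
    if not coverage:
        return 0.0
    long_enough = sum(1 for _, span in coverage.values() if span >= min_days)
    return long_enough / len(coverage)


# Made-up sample: patient A is seen across two decades, patient B only once.
sample_records = [
    ("A", date(2003, 1, 10)),
    ("A", date(2023, 1, 10)),
    ("B", date(2022, 5, 1)),
]

coverage = coverage_by_patient(sample_records)
share_over_one_year = share_with_min_history(coverage, min_days=365)
```

In this toy sample both patients count as "lives," but only half of them have more than a year of observed history, which is the distinction worth asking a data vendor about.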
John: [00:11:25] Yeah. For sure. It's I know that that's been one of the biggest issues around this, the biggest sticking points. You know, when you have claims data, it's a lot easier because it's all one data set and it's just looking at the specific billables, you see some outcomes, but you're missing a lot of the actual context around those treatment plans and the outcomes and what the initial presentation was and things like that. So obviously the most insights come from understanding how people's lives have led to some of these conditions, especially the the lifestyle based health problems and conditions. And so better understanding, cardiovascular disease, diabetes development, how do genetics factor into that versus lifestyle choices? I think that's going to be really interesting long term and also the ability to kind of learn that, hey, you can you can reverse diabetes if you're a type two diabetic and you are able to make the lifestyle changes. So understanding things like that, you know, we're not going to learn those insights until we get this massive population level longitudinal data that we can mine.
Josh: [00:12:28] Yeah, those lifestyle changes aren't reflected in claims data and they may not even be reflected in the EMR data. You know, lifestyle changes and social determinant changes are things that are going to be hard to quantify. Um, you know, I became a diabetic and it's caught in the data, and then a year later I'm not, and it's hard to see what may have led to those changes. Right? Like.
John: [00:12:59] Yeah. I mean, we don't have real insights into that. We're starting to get some of that from patient-reported outcomes from some of these care management platforms that people have developed over the last decade. But we're still in the early days of that and actually seeing how those all come together and are aggregated in a way that can be mined for these types of insights into population health. So how much does RWD factor into kind of modern initiatives around cost containment and utilization management and personalized medicine and evidence based care and all those modern initiatives that we're seeing? Let's start with cost containment. How does it factor into value based care initiatives?
Josh: [00:13:37] So it makes it a lot easier for companies now to track their costs over time. You know, with these large data sets, especially with the claims data, you can probably get most accurate in looking at patient cost, because these large claims data sets have most, if not all, of the claims for an individual. So quantifying their total cost of care over time is easier. And it's probably one of the easier things to do with real world data.
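As a rough illustration of why claims data makes this comparatively easy, total cost of care reduces to grouping and summing paid amounts once the claims for an individual are in one place. This is a sketch under assumptions: the (patient ID, service year, paid amount) layout and the sample values are invented for the example.

```python
from collections import defaultdict
from decimal import Decimal


def total_cost_of_care(claims):
    """Sum paid amounts per (patient id, service year)."""
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for patient_id, service_year, paid_amount in claims:
        # Amounts arrive as strings so Decimal keeps exact cents.
        totals[(patient_id, service_year)] += Decimal(paid_amount)
    return dict(totals)


# Made-up claims for two patients.
sample_claims = [
    ("A", 2022, "120.50"),
    ("A", 2022, "79.50"),
    ("A", 2023, "300.00"),
    ("B", 2022, "45.25"),
]

cost_by_patient_year = total_cost_of_care(sample_claims)
```

Note that this captures what was paid on claims, not what it cost the provider to deliver the care, which is a separate and harder question.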
John: [00:14:05] That's assuming that hospitals and care organizations are actually tracking their own expenses properly so that they can see what their costs, their costs of care are to provide that care. That's something that we've been talking about here at Chilmark for a number of years, and it just doesn't seem like a there's enough demand from the buyer set on the provider side. But also it's just one of those complicated things that I guess vendors haven't really tried to tackle too much. I know that Health Catalyst a number of years ago was going after the total cost of care piece and really understanding, you know, what are the incremental costs around each unit of care provided versus what the billable is at the other side.
Josh: [00:14:39] Hospitals and providers have a hard time quantifying how much they spend on patient care. They know how much they receive in billing. But if you ask them how much they spent to provide that care, it's very hard for them to quantify that. So you have you have one side of the equation how much they're getting for that unit of care. But understanding how much went into providing that unit of care is very, very difficult to tease out.
John: [00:15:05] Okay. So as we think about the shift to value based care, the shift away from fee for service, obviously benchmarking is another area where some of these platforms are able to aggregate real world data from other organizations near you so they can help with your benchmarking piece and understanding where you're going to fit in your stars ratings relative to your regional competition or your regional other players. So if somebody was to if they don't already have a real world data strategy, where would you suggest somebody starts developing that? How do they actually start thinking about what they need?
Josh: [00:15:38] Yeah, absolutely. So when you start thinking about what you need in terms of data, it starts with what you are trying to accomplish with the data. And I know that sounds obvious, but it's not as easy as someone thinks, because every data set that's out there and available for purchase or for examining has a different level of information in it. And so you may be looking at comparing yourself to your competitors in the field and doing competitive intelligence with the real world data set, but you're not able to tease out from that data set the information you're looking for. So it really starts with what information you want to get out of the data, and then using that as your basis for looking at who can answer that question for you and how well they can answer that question for you.
John: [00:16:29] So without getting into like specific vendors, given your own experiences with Southcoast or with 32, like where do you see the the initial value really coming from implementing this? I mean, everybody that is at a fundamental level, if you're doing value based care, you're looking at your performance metrics, you're looking at your outcomes and your length of stay, your 90 day readmits and then you're also just looking at your general performance. So what do you think? And utilization management. So what do you think? What data sets are good for getting into those areas and trying to evaluate how to route patients, how to make operational changes that can improve your bottom line and things like that?
Josh: [00:17:15] Sure. So a lot of those are going to come down to the claims datasets that feed into the larger dataset overall: what insurers does the dataset capture, and how much information does come through? Again, the companies that can layer a dataset on top of the claims sort of give you that much broader picture, especially when it comes to outcomes. But you really want to start with a vendor where, you know, for the things you pointed out, utilization management or cost of care, that really comes from a dataset that has a robust linked claims database, you know, somewhere where they can really track people over time across insurers and get you that sort of total picture for utilization management and cost of care and that type of stuff.
John: [00:18:08] Do you think that having a real world data strategy is, you know, taking looking at value based care as well as fee for service? Do you think that having an actual strategy around real world data is critical to contract negotiations these days? If you're a provider going out to structure your contracts with a payer?
Josh: [00:18:25] Absolutely, Yes. I think it's business critical and knowing where you stand, especially if you can compare yourself to others and whether it's on the provider side or the payer side and saying, you know, this is where I am in the market and I know that I'm getting paid more over here or getting paid more over there or, you know, I provide better outcomes and therefore doing a higher value of care and should share in the savings that I'm generating, like knowing where you are in the market and how to use that when when talking to either payers or providers, depending on which side of the equation you're on.
John: [00:19:01] So you mentioned SDOH earlier and how that is a big gap in a lot of these data sets. So obviously health care organizations are now mandated to start asking about social determinants of health. I was just at Epic last week. And, you know, one of the things that I heard a provider kind of complaining about with that mandate is that right now there is no clarity around what hospitals are supposed to be doing to address that. Like, they aren't required to be. It is a bit more of a social and political issue than it is necessarily a responsibility of the health system to be providing these community based resources or pairing people with them. I think it's great that organizations are trying to, but if an organization doesn't already have a connection with a community partner, like a Findhelp or a Unite Us or one of those organizations that's pulling in these external community based resources, where do you think the responsibility to actually track that lies for these real world data and evidence tools? How does that factor into this? Because, you know, anybody in health care has seen the stats where it's about 80% of care is determined by what happens outside of a care organization, outside of the care continuum. So obviously, you know, factoring in people's lifestyle choices, like we were saying earlier, is very important to really understanding the full condition and the full context of a person's health status. So with the uncertainty around the future of addressing SDOH still kind of being determined, what should organizations be thinking about now to try to collect this data and make use of these other insights that they now, you know, can track to a certain extent?
Josh: [00:20:39] That's a great question. So, you know, I think it's vital to understand the problem. The data does need to be collected, right? So hospitals and providers and insurers should all know their populations and what is going on, not just within the four walls of the hospital or in the physician's office. And it really comes down to initiatives on both sides to look at how you address those. And the hospital may not be the best place, right? The physician office may not be the best place. It may be on the payer side. You know, if you look at some of the Medicare Advantage offerings out there that can offer more above and beyond what your normal Medicare fee for service can, right? Like, there are Medicare Advantage programs that are offering other wellness incentives to join their Medicare Advantage, sort of taking that on the payer side and pushing that further, and looking at how, on the payer side, you can incentivize people: join my plan and we'll give you health care memberships and provide you with these external third party resources to address those social needs. But it comes back to, I think, both payers and providers have that responsibility to understand and collect that data, but it may not ultimately be up to the hospitals or the providers to address it. You sort of need that other outside source to push change, because it's not a medical problem. It's a social problem.
John: [00:22:06] Exactly. And I mean, this is one of those areas where people are trying to fix it with a technology. And it's like, the technology is really here to collect the fact that this is a problem and document it. It's a system of record, and then it can be a system of engagement where it's then passing you off to the right people. But with most of these things, it's more of a human driven factor than a technology driven one, other than just filtering up what resources would be best for which person based on their need. So you mentioned the plans and just tracking the data to understand your patient population. And that reminded me that one of the other big drivers of some of the advancements over the last couple of years has been the public health emergency and the need for public health surveillance. So can you talk a little bit about how the Covid emergency and the pandemic lit a bit of a fire under the federal and kind of national initiatives around aggregating population health data?
Josh: [00:22:59] Yeah, absolutely. So I want to preface this by saying I have a public health background, and so I am all for anything that pushes us towards a wider understanding of the health care system and how it works together, because it's not just the medical side of it. You know, there are other factors outside that that affect our health, as we've talked about with social determinants of health. And so it's been discussed for a long time in the health care field, like, you know, how to get this wider use of data and collect more data, not just health-specific data. And it was really under the public health emergency that a lot of health care leaders came to realize that it's not just claims data that drives how we do health care. A lot of leaders now really understand that it's sharing of that data and working together, being less siloed and having that interoperability of data. And also that the sharing of the data sets can give you a more complete picture of what's going on and where there may be issues. You know, in terms of Covid, right, like, we could see a high Covid rate over here because we were able to share data within a state or across state lines, and we were able to adjust resources to go meet that demand. And so I think a lot of health care leaders saw that play out in the public health emergency and want to have that within their own institutions, or that granular level of data within their own day to day working lives.
John: [00:24:38] Yeah. I mean, that's a great answer. And one of the things that I think you may have glossed over a little bit, though, is how hard that was. So how did we fix that? How did we actually, as a country, start collecting this data? Because obviously there was no well configured national surveillance platform. The CDC has been lagging on this for quite some time. It's been an area of high criticism of them since the pandemic. So what did we actually do? What were some of the levers that were pulled and dials that were turned to kind of get this data flowing in a more effective way? I know that Epic specifically, you know, they like to point to Cosmos and their Epic Health Research Network as being one of the first places that people could actually do, like, population level analytics around the diagnosed conditions, because they had one of the largest data sets that was actively tracking conditions, actively tracking cases. But that was just one EMR. And while Epic may be the most installed in the US, it still doesn't make up everybody's health status, and it typically falls into larger health systems rather than smaller ones. So how did we actually collect that information from some of the, you know, less advantaged care areas?
Josh: [00:25:48] I think it moved a lot of organizations that lagged behind into understanding, and into more of these data sharing agreements. Right. There were third party organizations that were willing to jump on the problem and help figure it out. And money. A lot of the issue around data sharing is that if you don't know what you're doing, hiring the right resources can be expensive. And I think one of the large parts of the public health emergency was the money that came into the health care system to not only treat patients, but I think it also helped on the back end to bring organizations up to speed on how to data share: what they need, how to do it and what value they can get out of it. It was maybe something they talked about for a while, but the money finally came from the public health emergency and helped push them in that direction.
John: [00:26:39] Well, it became it became another business requirement during that period because everybody was struggling just to stay on top of it and have a better understanding of how it was affecting their their region specifically because it was, you know, isolated pockets as well as a national issue. So everybody wanted that localized data as well as just understanding what the national trends were looking like. Yeah, I mean, so that's a great response. Love it. And I do. I also wanted to call out that we also saw real world data being used for investigational purposes, and we got to see the science playing out in real time because we were getting more of this data coming in to provide evidence around what was working, what wasn't, how, how was this being spread? You know, the whole masking thing initially. So they said that we didn't need to use masks and they were like, Oh wait, just kidding. You should probably wear masks. So like that was largely because of the evolving real world evidence based on the data that was being collected. And so I think that was also really interesting because it exposed to a lot more of the nation Just what is possible when you do get these organizations working together on a national level, but also what some of the problems are as far as getting a better picture around what does an aggregate perception of this whole disease and pandemic look like? Because we do have all these silos and it really showed to people that were outside of healthcare IT and outside of healthcare just how fractured this is.
John: [00:27:57] I mean, everybody I think, has a sense of how fractured the data piece is because, you know, every time you have to go to a new care site or even a care site within the same hospital, you have to fill out the forms again so people have a sense of it. But I think this really drove home just how disconnected the surveillance piece is. And obviously, you know, being a public health person yourself, you know that this is something that public health people have been talking about for decades now. But it's only just now happening.
Josh: [00:28:22] To add on to that, for the first ten, 15 years of my career, I used data and did research with it and analyzed data as an analyst. And even working with it on a day to day basis, I did not understand the fractured nature of it and how bad it actually was, until I moved to one of the health systems I worked for. I was there during an Epic implementation and really got to see, not just Epic and what it is, but all of the fractured nature and how claims come in and move through the organization and how EMR data moves through the organization. It was really eye opening, even as someone who had worked in the field for 15 years, to see it and really get a sense of how fractured data actually is. And that's just within one health system in one area of the country.
John: [00:29:13] Yeah, not even a national health system. It's been made through a series of acquisitions of different organizations.
Josh: [00:29:19] Exactly. Like, you know, just the monumental task of pulling together three hospitals, putting them all on Epic when they all had legacy EMR systems and then legacy payment arrangements with the various numerous providers in the state as well as national payers who would come through on occasion. You know, it was quite eye opening. So yeah.
John: [00:29:44] Yeah. I mean. Three hospitals isn't even that bad.
Josh: [00:29:47] Yeah. And that's to get back to your point, I think that's what the public health emergency really sort of surfaced for a much wider population of the US was, you know, the demand for knowing and understanding what was going on and there just being a lack of that information out there to get a complete picture or a more complete picture than what we were getting early on in the pandemic.
John: [00:30:09] So to that point, too, I mean, one of the other outcomes of that was there was a lot more exposure and kind of public awareness around the inequity issues, because not only were we able to see in real time that certain disenfranchised or disadvantaged communities were getting hit harder by the pandemic, partly because they had to work, you know, essential jobs at minimum wage or just above minimum wage, but we got to expose a lot of the inequities of care. And so how do you think that real world data going forward is going to help with the inequality issues that are such a big focus of health systems and the federal government today?
Josh: [00:30:46] It's going to be hard to do because, again, it's one of those things where, at the patient level, it's hard to quantify those social determinants of health. Right? We have some proxy measures out there where we can say, you know, because of where you live or the job that you do and some other sort of measures, that you fall into certain possible social determinants of health. But addressing those, I think it's going to show in the years to come that it's hard to get that data, as we've spoken about, and to understand what it means. And, you know, every individual has a thousand different social determinants that go into their day-to-day health and their long-term health, so it gets hard to quantify easily in an EMR or in a claims data set. So I think we're in the beginning stages of using these real world data sets to find equity and, you know, finding that inequity in the system to address it. Again, it's hard to tease out what each individual social determinant of health means for someone's long-term health. And, you know, a lot of times we use sort of general geographic zip code level data or county level data. But even within that zip code or county, there can be wide inequalities, inequities in the way health care is delivered and served. And some of that comes down to the way health insurance is administered within the country. Sort of teasing out what is really the bigger effect is hard to do. And the data we have available to us is not really there yet to be able to tease out those differences.
John: [00:32:37] I think that's really another one of those opportunity areas where it's like, okay, what are some of the actual health-relevant data sets that we don't have already? What are some things that may not have been historically considered relevant to health but really should be considered? Like today, it's not just claims data, there's credit data. Credit data is used by health care organizations all the time. You know, I remember seeing Experian and TransUnion, you know, at HIMSS or at other health care conferences. I thought that was so weird at the time. But they do play a factor in eligibility and ability to pay. So what are some new data sets that you think we're missing that would really benefit society right now?
Josh: [00:33:17] So I think it's larger patient-reported outcome data sets, including those in the real world data that's already out there. So as I mentioned, a lot of data sets that are out there now are a combination of claims data and EMR data. So there's some of that information. But I think really understanding a patient's longitudinal experience through the health care system comes down to, at the next layer, adding in those patient-reported outcomes, right? Those sort of quality of life issues, adding those in there and understanding really what affects quality of life going forward. You know, we can see sort of those larger effects, right? Like this treatment or this drug sort of prolongs life by X number of years. But it's really getting down to what that quality of life is in those extra years and understanding from the patient, like, hey, this surgery really improved my quality of life. And, you know, I had a better sort of life outcome than just the raw "I lived for X number of more years" or "I'm going to live for X number of more years," right? It's that quality. Getting to understand that quality of treatment and quality of the experience is sort of the next big step.
John: [00:34:34] So IBM made a big play in this area. Basically all of Watson Health was a real world data and evidence play. It was one of these early AI-powered plays. They bought up a bunch of different companies that were specifically aggregating different data sets, but then they recently spun that out as Merative and, you know, it became its own organization. So without going too deep into doing a postmortem on why Watson Health ended up being kind of divided and branched and sold off, what do you think some of the vendors out there should really be looking for to continue to be successful in this space as their competition consolidates, as the field gets narrower, as more people kind of lean into their specific data sets or new entrants enter the market? So what should organizations that are already playing in this space be doing or thinking about to stay ahead of the market and to really stay innovative?
Josh: [00:35:26] So I think one of the things they need to do is know the limitations of what their data can and can't do and not overpromise something outside their data limitations. Right? Like we've talked about, there are no true longitudinal data sets out there for patients yet, and saying that you have that already or you're developing that, like, it's a lot harder than it looks. And that's sort of, as we come back to it, where the IBM experiment, the Watson experiment, ran into trouble.
Josh: [00:35:59] Because of the fragmentation, the data fragmentation that we have in US health care, it's not as easy as it sounds. And coming out swinging for the fences was sort of setting themselves up for failure. There's more of an incremental change that needs to happen slowly over time. So know your data, know what you're looking for, and then work in that singular focus going forward. And I'd like to say it's sort of like staying in your lane, right? Like, know what you do and do it well.
John: [00:36:34] So building off of what you just said about people staying in their own lane: if people want to expand their market share, obviously they need to expand beyond their lane. They can perfect their one lane. But then as they think about adjacent areas to get into, you know, they should really be going after things that expand that lane and align well with their data set, as opposed to having, you know, partitioned data sets that don't actually align well. So if you look at the way that we broke up the market into a couple of different categories, how would you kind of think about clustering those functionalities? Or if you thought about doing it stepwise, you know, start with claims data. That's kind of the cleanest, easiest to compute. The most robust tools are already out there for analyzing that. That's been around the longest. What do you think about that expansion as far as the additional data sets that kind of build off of that in a stepwise function? Obviously, you said EMR is probably the next one. So the clinical data is that next level past claims.
Josh: [00:37:28] So the next level past claims is, you know, as we've talked about, sort of folding in that EMR and actually seeing the patient experience within that health care system. And after that, it's getting at those patient-reported outcomes and expanding from there, growing it outwards to start including the patient and their actual experience, beyond just this strictly financial look at it through the claims and through the EMR experience or their health care system experience, right? You want to start bringing in the patient and what their experience is.
John: [00:38:07] So we've talked a lot about the SDOH and the patient-reported outcomes, but we haven't talked at all about genetics data, which is another new data asset that's getting pulled into these real world data tools. And I know that there are some companies out there that are specifically focusing on this. Like Invitae, they recently acquired a company called Ciitizen, and the two of them are doing a real world data and real world evidence play largely aligned around clinical trial recruitment for rare diseases. There's a company called Exact Biosciences that was at Epic. They're one of the partners with Epic doing genetics testing, doing SNP marker testing and having that feed into Cosmos or what Epic is calling the Cosmos home, I guess. But they're pulling in genetic data through their new tool called Aura, which brings in labs. I've seen a couple of other organizations talking about factoring in genetics data, but obviously there's SNP data, which is incomplete but focused on specific polymorphisms where we know there are differences in DNA, and then there's full genome sequencing. So where do you think the future of real world data is with regards to genetics information and these new testing tools that we have?
Josh: [00:39:19] Genetic testing has exploded in the last several years, and there are now sort of large data sets out there and many companies developing new genetic tests. And while I think it could be instructive, informative to enter that data into a data set and look at it over time, I'm not sure that we've thought through how to do that in the most appropriate manner. In the future, it will provide a much more complete picture of the individual. But I think the understanding of genetic testing and how it affects our health and the outcomes for the patients is still being examined and still not fully understood yet.
John: [00:40:02] Well, and also, if you're talking full genome sequencing, we have a whole issue of, like, can you truly de-identify that? I mean, if somebody has the right data sets, they can kind of hack into who you are with that. Not that you can't do that with other data already, but with genetics data, I mean, that is at its core very personally identifiable health information, if we had the tools to do that.
Josh: [00:40:23] Yes. And that's one of my hesitancies here: it's getting into some very gray areas in terms of, you know, de-identified data. A lot of these data sets are anonymized, and in theory, you and I don't know who we're looking at as we comb through this data. But as you get these larger data sets, keeping it truly de-identified and private gets harder and harder.
John: [00:40:54] You know, you need reams of this data. Like, the volume that we're talking here is pretty astronomical. It puts a lot of other industries to shame, honestly, with how much data is being produced in health care and being analyzed. So how do you actually assign a value to an individual's contribution to that pool? I feel like we're going to start seeing more people advocating for getting compensated for their data being used, but how do we actually do that as a society? How do we approach it, how do we think about what each of our contributions are? Is it that you get a percent of a penny every time your data is queried? Like, what do you think that future might look like?
Josh: [00:41:34] Well, that's a great question and a great idea. You know, the privacy laws and a lot of this work, ultimately those next steps we've talked about, sort of only revolve around people giving up their information. Right? So we want to do patient-reported outcomes, but you have to find a way to incentivize those individuals to give you their outcomes. And, you know, it's all patient reported. As a patient myself, what incentivizes me to give my information over? You know, I know there is some loss of privacy. However, there is a social benefit that I see in handing over my data. And if someone does not see that social benefit, there are sort of those questions of, like, how can we help you along the way, framing it not as handing over your data, but as helping advance science and looking at how to improve health care for everyone.
John: [00:42:27] Okay. Last two questions. Were there any interesting findings from the research that you did for this report?
Josh: [00:42:35] One of the interesting findings I had was just how many large players have come into this space, you know, either through acquisition of other companies or by starting up their own business line within the company itself and sort of starting from scratch. I think it's not necessarily because they see the utility in it, but I think it helps, you know, these companies attract new business and sort of grow their bottom line, because it's something that people are clamoring for. Right? Like, for anyone with the ability to look at these real world data sets and help businesses parse through that information and develop usable information that can help a business, you know, there are companies willing to pay for it. So, you know, there have been large companies that acquired data assets as well as individuals with the skills to cut through the data. And then there have been some that have just sort of said, we're going to jump into this field, and started their own divisions. You start seeing it, and it's companies that are nontraditional health care companies deciding it's worth it for them to dive into the health care field.
John: [00:43:49] Give me an example of some that are doing this data play that aren't coming from health care directly, because obviously a lot of the companies in real world data are ones that have been collecting health care data already for a while. So what are some of the companies that aren't necessarily health care native?
Josh: [00:44:00] Oracle is the big one I can think of, with their acquisition of Cerner and Enviza. You know, they get not only a huge data asset out of that from the Cerner side, but they also get the wherewithal from the Enviza side of how to dig through that data. And Enviza actually does bring some of its own data assets to the table as well. So, you know, that's a prime example of a large company that wasn't necessarily in the real world data space for health care and, you know, sort of jumped in feet first and, through acquisition, got into the market.
John: [00:44:40] Yeah. Decided that they were going to go after the health care vertical more explicitly than they have in the past. Yeah. And the other one that comes to mind for me is Thermo. You know, Thermo Fisher traditionally, they're a lab supply company first and foremost. I used to order from Thermo all the time. They're the ones I would get the HeLa cells from. I would get media from them, and they're just a massive purchasing distributor more than anything. But they have recently decided to get into the clinical research and clinical trials side. So they bought a company, PPD, that runs clinical trials, and they recently bought CorEvitas to bring into that PPD unit in order to have a real world data asset to go along with their CRO offering. So yeah, you're seeing a lot of companies that may be adjacent or, you know, running in parallel with health care in some capacity, but, you know, not having as much of a clinical play, really getting into the space too. And I think that we're going to continue seeing that as organizations that have lateral business units that can benefit from health care data start acquiring smaller data plays and working that into their core offering. Okay. So thinking about the next few years, what are you most excited about that will be enabled by these data assets and technologies?
Josh: [00:45:47] So what I'm excited about is more along the lines of personal interest, but I think a lot of companies will start using these data assets to redesign payment models for how health care is delivered. Now that we can see total cost of care, or a more complete cost of care, for patients, it's really using that information, you know, both the payers using it as well as the providers using it, to say, hey, how can we actually address the cost of care? I think that's a large area in the coming years, as both providers and payers get into this space as well. You know, traditionally a lot of this real world data, real world evidence has been used on the pharma side. But you're seeing, like I said, payers and providers getting into the space and really starting to understand their own business and then how they can reduce the cost of care, or attempt to.
John: [00:46:40] And that's what you're working on now, right? So as you're looking at the data sets that you have access to, what would be really useful to you in achieving that vision that either doesn't exist today or that you just don't have access to?
Josh: [00:46:53] It goes back to something we were talking about earlier, and that's social determinants of health, right? Really getting at that and understanding where these patients are coming from and what affects them on a day-to-day basis, because generally they don't have contact with the health care system on a day-to-day basis. Right? And so most of their time is spent in the community, and it's really getting to understand how we can be in that community and change the things that are affecting their health and help them improve their health outside of the medical system.
John: [00:47:25] Yeah, I think that's a fantastic place to end. I'm totally with you on that. I think that is really the goal here, as we've quoted you before: just getting care to be improved and developing more evidence-based, outcomes-based medicine. I think that's going to be a really exciting future that we can see on the horizon from all these data assets. It's just a matter of, to your point, getting in that extra data around how life decisions and lifestyles factor into all this, but also that cleaning of the data, making sure that it is reconciled well for each individual patient, and making sure that we can build out those longitudinal records. I'm curious to see what happens with the QHINs and how much that actually does enable that type of an outcome with this data. But it's still very unclear whether we'll actually be able to create proper longitudinal records with that initiative or not. You know, we're going to be working together on a buyer's guide to complement this Market Trends Report that we just did, and that will have a little bit more pointed guidance and feedback on selecting a solution and what to look for as you're developing your own strategy as a stakeholder in health care, whether that be a provider, a payer or pharma. You know, the three major stakeholders that are utilizing real world data now will be interested in that buyer's guide once it's developed. So I'm looking forward to building that out with you and seeing some demos of this technology, because I haven't seen a lot of demos. I've seen a couple of them so far, but I'm very curious to see exactly how different all of these solutions really are and, you know, putting people through their paces to see whether their claims about what they can do with their data are actually true. So I'm looking forward to doing that with you.
Josh: [00:48:58] Absolutely. I'm looking forward to it as well.