How does it work? HCN3D Under the Covers

Allen Hall
Sep 17 2019

The goal of this article is to summarize how HCn3D uses Reinforcement Learning to identify potential outcomes, without getting too technical by delving into machine learning functions and algorithms. We won’t be trying to make the reader a “data scientist” or AI expert.

The best way to kick off this discussion is to consider what we do as humans to learn or, more specifically, to anticipate what is going to happen next. In the moment, we observe what is happening around us, our “environment”, to make a decision and take “action”. We analyze the results of that action, the “feedback”, to determine the new “state” of our environment. We repeat that process iteratively until the task is completed or the goal is realized. A simple illustration of the process could look like this…

For HCn3D it is much like navigating a maze with a single point of entry and multiple points of exit, where each exit represents a path to a potential outcome and each path can be traversed in multiple ways. If you were to assign a value to the number of turns, successive attempts to navigate the maze would result in different costs to find an exit. Those “costs” would then permit HCn3D to rank the potential outcomes.

A couple of things are worth noting to avoid oversimplification:

Traversing the maze changes it, because the environment changes. Essentially, each new turn or action results in a new maze.

Value can be defined in ways other than monetary cost, for example longevity and pain management.

Q: So just how does HCn3D use Reinforcement Learning to determine “value”? In other words, “How does it work?”

It functions by analyzing the patient journey of “like patients” to determine potential outcomes for the subject. In essence, the history of like patients is analyzed to determine the potential futures of the “subject patient” at the point of action we call “the moment” or, more technically, the “inflection point”.
Conceptually, multiple moments are sequenced together to analyze the past retrospectively for both the “like patients” and the subject. The result is a patient trajectory looking forward, prospectively ranking the actions that could be taken to realize the desired future outcomes.

That’s a nice concise answer, but what does it really mean? To explain, we will refer back to our simple flow diagram above, expanding on it to incorporate the “cube” of HCn3D…

The cube and the environment in which it resides are a holistic representation of the moment, the point in the patient journey where a decision is made and action taken. For more information about the cube, watch the video The Patient Journey Through the World of HCn3D.

As you would expect, the natural extension of a single cube is multiple cubes stacked together, forming the patient journey. Whether the interval between moments is short (possibly just seconds to minutes) or long (weeks to months), preceding cubes become prior information or “state” for the current cube and its environment.

Potential outcomes are based on the analysis of the journeys of other like patients. HCn3D calls them “Future Value Outcomes”. This is a form of counterfactual analysis, a fancy name, but conceptually it’s simply using other patients’ past journeys to construct a model that identifies potential future outcomes of the subject patient. Using a little machine learning analogic “magic”, these other journeys, possibly millions of them, “teach” HCn3D how to respond in the next iteration, at the next inflection point or moment.

Agent – Represents the provider (doctor, clinician, etc.) taking Action “in the moment” at the Inflection Point. When the analysis performed is retrospective (patient history), the Agent is of course a real provider who made a decision in the past that is now part of the record.
When the analysis is prospective (potential patient outcomes), the Agent is the machine learning algorithm traversing the possible patient trajectories.

Action – The decision made by the provider (doctor, clinician, etc.) “in the moment” at the Inflection Point. Once again, when the analysis performed is retrospective (patient history), the Action occurred in the past and is now part of the record. When the analysis is prospective (potential patient outcomes), the Action is chosen by the machine learning algorithm while traversing the possible patient trajectories.

Environment – Is holistic, including the patient record, the Healthcare Network, related extrinsic data, and unknown causal information. Computationally, it is where the Agent’s Actions are evaluated to formulate the “feedback”: State, Reward, and next iteration.

Policy – Is the strategy employed by the Agent as it moves through the Environment to determine its next Action. When the analysis performed is retrospective (patient history), the Policy reflects a real provider having made decisions based on knowledge and insight. When the analysis is prospective (potential patient outcomes), the Policy is the result of the machine learning algorithm applying the counterfactually formulated model or analogy.

Immediate vs. Deferred Rewards – As you might expect, anything described as “reinforcement” needs some method to convey reward. Reinforcement Learning weighs immediate against deferred rewards, and that weighting influences which outcomes are favored. For instance, an acute condition would necessarily require a bias towards more immediate Actions, whereas a chronic condition might favor deferred rewards. In Reinforcement Machine Learning, a “Discount Factor” is applied to adjust the reward, but we promised not to get too technical, so we’ll stop there.

State – Is simply the way the Agent sees the Environment in the moment, at the Inflection Point, the demarcation between the Prior Information of the past and the potential outcomes of the future.
Trajectory – Is the Patient Journey from moment to moment that results from the decisions made at each Inflection Point, a sequence of State/Action pairs.

Value – Is reflected in the Action Spectrum at the end of a cube sequence, essentially a long-term return value as opposed to a short-term reward.

Being analogous to the way humans think, Reinforcement Learning is aptly named. If you had the time to read and analyze thousands to millions of patient journeys, you would identify similar outcomes. Of course, we all do this inherently without thinking about the mechanics, but with a very limited set of data. Fortunately, thanks to machine learning, and more specifically Healthcare in 3D, providers can accomplish in seconds what would otherwise take several lifetimes.
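
For readers who do want a small peek under the covers, the “Discount Factor” mentioned under Immediate vs. Deferred Rewards can be illustrated with a few lines of Python. This is only a minimal sketch and not how HCn3D is implemented: the action names, the reward numbers, and the two discount values (0.1 and 0.9) are made up for illustration. The formula is the standard textbook discounted return, where a reward arriving t steps in the future is multiplied by the discount factor raised to the power t.

```python
from typing import Dict, List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Sum the rewards of one trajectory, weighting step t by gamma**t."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


def rank_actions(reward_sequences: Dict[str, List[float]], gamma: float) -> List[str]:
    """Order candidate actions from highest to lowest discounted return."""
    return sorted(
        reward_sequences,
        key=lambda action: discounted_return(reward_sequences[action], gamma),
        reverse=True,
    )


# Hypothetical rewards observed at three successive moments after each action.
# These numbers are invented purely to show the effect of the discount factor.
candidates = {
    "treat_now": [5.0, 1.0, 1.0],        # large immediate reward, little later
    "long_term_plan": [0.0, 2.0, 10.0],  # nothing at first, large deferred reward
}

# A small discount factor makes future rewards count for very little.
acute_ranking = rank_actions(candidates, gamma=0.1)

# A discount factor near 1 lets deferred rewards keep most of their weight.
chronic_ranking = rank_actions(candidates, gamma=0.9)

print("gamma=0.1:", acute_ranking)
print("gamma=0.9:", chronic_ranking)
```

With the small discount factor, the immediate reward dominates and "treat_now" ranks first, which matches the bias an acute condition would call for. With the discount factor close to 1, later rewards keep most of their weight and "long_term_plan" comes out ahead, the kind of preference a chronic condition might favor.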