Thank you very much, Robin, and I’d like to thank the organizers and the course directors for inviting me to speak to you today. I hope to show you a little bit about what we’re doing in Canada and internationally with big data and large data sets: defining what they are, understanding the opportunities for IBD research using big data, redefining translational research as it now exists from bench to bedside (we’ll talk about a new way of understanding translational research), and demonstrating that epidemiology research can give clues in the search for environmental contributors to IBD pathogenesis.

So we all know the old definition of translational research, where you take a finding that you make at the bench and you bring it to the bedside, to patient care, to improve the quality of life and the outcomes of patients. I suggest that there’s a newer translational research that’s developed recently, where not only do you have the bench-to-bedside approach, but you can also take findings that you make at the bedside or at the bench and apply them to the health system, in order to improve the quality of the care that’s being given to patients with IBD, and finally apply them to the population. In addition, you can take findings from the health system or from the population and then bring them back to the bench to try to understand the biology behind the findings that you’re making in the large, population-based data sets. I’ll show you some examples of this cycle at work in a little bit.

So firstly, let’s talk about what I mean by big data. It’s a bit of a buzzword, and I’m sure people are starting to get tired of that buzzword when they hear about Google and Facebook and all of the different types of big data that are being applied. But in our case here today, mostly what I’m speaking on is big health data. Health data in Canada is called health administrative data. It’s collected on every person within a
given province. So there are 10 Canadian provinces, the healthcare system is publicly funded, and every person who has a health card, which is more than ninety-nine percent of the population, is tracked within these large health administrative data sets. So you can follow people from birth until death, as long as they remain within the province, and you can assess whether they’ve used the health care system, which physicians they’ve seen, and whether they’ve been hospitalized. More recently, we’re starting to link those data to other population-based data sets to try to look at other outcomes and other factors within Canadian society overall, and I’ll give some examples of that in a little while.

But if we define big data overall, not just big health data, people talk about the five Vs. The first V is volume: the size of the database that’s being collected, and we’d be talking about millions or more individual-level records. Velocity is the speed at which the data is collected and the speed at which it can be changed, and this is the one of the five Vs that health administrative databases in Canada may or may not satisfy in terms of the definition of big data. Health administrative data in Canada usually takes about a year to get into the databases, so you’re delayed in finding results. So clearly, if you’re looking at the epidemiology of outbreaks of infectious diseases, that’s not really practical with those data; we have other sets of data that we can use, because we’re not going to be waiting a year to find out what’s going on with an infectious disease. Variety is the different forms of data that are available, the different types, both in terms of the structure of the data and also where it’s coming from. The veracity of the data means the uncertainty of the data: how messy is the data? And obviously, in
terms of health administrative data, we’d be talking about very, very messy data. We can’t rely on physician billing information to know that the patient truly has the disease: given the ICD code, the International Classification of Diseases code, that’s associated with the physician billing, does the patient truly have the disease or not? So we talk about validating those data before we use them. So certainly that V, the messiness of the data, would apply to Canadian health administrative data. The last V, which is not shown here, is the value that you get from using the data, and that’s a big one in Canada. Conducting prospective cohort studies, or even prospective clinical cohort studies, is a very expensive study design, whereas health administrative data is being collected routinely by the healthcare system and is available for research now, so it’s fairly cheap and easy to access, and we can develop very good questions and get very good answers in order to conduct research with those data. So the value is really important in the Canadian healthcare system.

So when I’m talking about health administrative data and big data in Ontario, I speak right now about the IBD data set that we’ve got. We validated two different algorithms, and algorithms means combinations of different codes: physician billing codes, hospitalization codes, procedure codes and medication codes, in order first to accurately identify children with IBD, and more recently to identify adults and elderly patients with IBD. So we were able to develop good validated algorithms to know that the patients truly have IBD, and once we know that they have IBD we can follow them both before their diagnosis, if they were in the province, as well as after their diagnosis, as long as they remain in the province. With that we’ve developed the Ontario Crohn’s and Colitis Cohort, which is the largest ongoing surveillance cohort of IBD in the world. Those algorithms apply from 1994 to 2011 for pediatrics and 1999 to 2011 for adults and elderly patients.

We described the trends in incidence and prevalence of IBD in Ontario over time and found that there are about sixty-eight thousand people in our data set, so 68,000 people living with IBD in Ontario right now, and we’re diagnosing about three thousand new cases per year, of which about ten to fifteen percent are in pediatrics, in under-18-year-olds. The crude prevalence of IBD increased by about sixty percent between 1994 and 2011 in Ontario, so a very significant increase in the number of patients living with IBD in the province. This graph shows the incidence of IBD in pediatrics and adults. The line graph shows that clearly, like we see in other places, the peak onset of IBD is really in about the third decade of life, in 20- and 30-year-olds. However, the incidence, shown in the bar graph here, is rising most rapidly in children under the age of 20, and particularly rapidly in children under the age of 10. This is Ontario data, and so a very important finding in Ontario. We also
confirmed that Ontario has amongst the highest incidence and prevalence of IBD anywhere in the world. If you look at this systematic review published by Gil Kaplan in about 2012, Ontario is that white spot (I guess there’s no pointer here) right in the middle of Canada. So obviously there’s the big red area of the United States; right above that, right above New York, is Ontario. At the time that this was published, there was no information on the incidence of IBD in Ontario; it wasn’t available yet. But now we know that the incidence ranges between about twenty-one and twenty-six per hundred thousand, making it fit very well in the red on that graph, the highest incidence of IBD anywhere in the world.

So what about other provinces? We showed that the incidence is rising in Ontario, and if you go back a little bit, I didn’t focus on this, but the incidence is really rising across age groups. There’s no specific age group where it’s decreasing; the incidence was significantly rising in the pediatric population but also in adults aged about thirty to sixty years old, and the other age groups were not significantly rising. Other provinces did not necessarily show the same findings. In Quebec, and this paper was published in the same issue of the IBD journal as the Ontario epidemiology, you see the same peak of incidence in the 20-to-30-year-olds. However, they actually found that there was a significant decrease in the overall incidence of IBD, and that decrease was pronounced and statistically significant in Crohn’s in 40-to-50-year-olds and in ulcerative colitis in 20-to-30-year-olds, 40-to-50-year-olds and 50-to-70-year-olds. Now, when they looked at their pediatric incidence, under 20 years old, they were not able to demonstrate a significant increase or decrease, so they weren’t powered to show a difference. So clearly a bit of a different finding in Quebec than in Ontario, and those are neighboring provinces right next to
each other. In addition, Nova Scotia, which is a much smaller province on our east coast, has also shown a significant decrease in the incidence of IBD. They also were not able to show a significant change in the pediatric age group, mostly because they had extremely small numbers: it’s a much smaller province and their pediatric cohort really consisted of about 50 to 100 people. But overall the incidence of IBD was decreasing in Nova Scotia.

These findings prompted Gil Kaplan to write an editorial in that same issue of the IBD journal, and I urge you to read it if you’re interested in the strengths and weaknesses of using health administrative data. Basically, he compared the Quebec and Ontario studies and talked about the methodology: the ways in which we identify patients, the ways in which we follow them up, and how long we allow them to qualify. So if it’s, let’s say, five codes over a given period of time, what is that given period of time in order to say that they truly have IBD? Is it four years? Is it indefinite? That can change the incidence and prevalence of the disease. So he talked about methodological differences that may account for differences in findings.

Based on this information, we realized that we really need to get a better idea of what’s going on in Canada, and so we developed the Canadian Gastro-Intestinal Epidemiology Consortium, or CanGIEC, pronounced “can geek,” because we’re all data geeks, so we thought that was a cute little acronym. What we want to do is reduce the methodological heterogeneity by using the same approach, a distributed network analysis approach, in multiple provinces. We can’t share individual-level data across provinces because of privacy laws, but what we can do is conduct the same analyses in each province and then meta-analyze the results, accounting for the heterogeneity, and try to find exactly what’s going on in Canada. This would provide a population-based estimate of pediatric-onset IBD. The advantage of this is also that our numbers get even bigger, so remember, the volume of data is really important here: we’re then able to show what the trends are over time in smaller and smaller age groups.

So this is the first study, which was presented a couple of weeks ago at Canadian Digestive Diseases Week, looking at the epidemiology of pediatric IBD across Canada using data from five provinces. You can see here that generally the provinces were about the same. The exception was Nova Scotia, which was shown to have a higher incidence of pediatric IBD than other provinces, and that’s been shown before in one of Charles Bernstein’s studies, which looked at five different provinces and at adult IBD but also found that Nova Scotia had a higher incidence of IBD. We don’t really understand why. And then when we pooled and meta-analyzed the results, we basically found that the overall incidence of IBD
ranged from about 7.91 per hundred thousand in 1999 up to about 10.55 per hundred thousand in 2008, when all five provinces could contribute data, and it was about nine per hundred thousand in 2010, when Ontario and Manitoba were able to contribute data. That number, nine to ten per 100,000, is the second-highest incidence of pediatric IBD under the age of 16 years old in the world. The only higher one that’s been described was in Norway, and that was described a few years ago.

We also wanted to look at trends over time, and what we saw for the 0-to-16-year-old group was that there were significant increases in incidence in both Ontario and Quebec. So remember that previously we did not see a significant increase in Quebec, but when we used the same methods and extended the data a little bit longer, Quebec did show a significant increase in pediatric incidence. However, when we meta-analyzed all the provinces, we could not demonstrate a significant increase. The incidence rose by about 2.1 percent per year of follow-up between 1999 and 2010 (percent per year, not per hundred thousand), but this was not statistically significant. Similarly, Crohn’s disease went up by about 1.7 percent per year, and ulcerative colitis went up as well, but not statistically significantly. However, one interesting finding when we broke it down by age group was that the only age group with significantly increasing incidence was the zero-to-five-year-olds, who were increasing by about 7.1 percent per year of follow-up. So a very sharp rise in our very-early-onset IBD cases. The other age groups were increasing by about two percent per year, but it wasn’t statistically significant.

So let’s go back to Ontario and look at what’s going on within this increased incidence. When you graph Ontario pediatric incidence overall, what you’re seeing is really that there’s a fairly sharp rise after about the year 2001 or 2002, and it’s pretty
pronounced. Before that it seems to be relatively flat. So what’s going on after around the year 2000 that may account for this increased incidence that we’re seeing in Ontario?

You’ve all seen this Venn diagram before, the causes of IBD, where you have the genetic susceptibility, which only increases your risk a very small amount, and you have the change in the environment, or an environmental trigger, which might change the microbiome and trigger a dysregulated immune response. So when we think about this, let’s first look at whether genetic susceptibility has changed in the province. There are over a hundred and sixty different genes that have been associated with IBD, and most of them overlap between pediatrics and adults, so not really anything there that would account for such a sharp rise of pediatric IBD specifically. And were there any recent genetic changes in Ontario? I put this Toxic Avenger slide up for Joel Rosh, who’s from New Jersey. I don’t know what it is, but we don’t seem to see the Toxic Avenger and Joel in the same room together at the same time. But there have been no nuclear accidents in Ontario which would have caused some massive genetic mutations that would account for this rise in pediatric IBD. Maybe just in New Jersey, I don’t know.

However, what has changed is the genetic background of Ontario. Ontario has one of the highest immigration rates anywhere in the G8 nations, and what’s changed is that the biggest immigrant group in the 1990s was people from China and Hong Kong, who are known to have a very low risk of developing IBD both in China and in North America. However, more recently, since about the mid-90s, people from South Asia, so primarily India, Pakistan and Sri Lanka, have become the number one group and now account for about thirty percent of immigrants to Ontario. We’ve known for a while from studies in the UK that people from South Asia, when they arrive in the UK, seem to have a higher risk of developing IBD after arrival, particularly ulcerative colitis. More recently, there was a pediatric IBD study from Vancouver, in Canada, which showed that children of South Asian descent had about three times the risk of developing IBD compared to children of Caucasian descent.

So we wanted to look at what was going on with immigrants to Ontario and their risk of developing IBD. In Ontario we’re privileged that we have access to the immigration data for Canada, and so we can link our health administrative data and our Ontario Crohn’s and Colitis Cohort to the immigration data to determine whether people in the province are immigrants or not, as well as their country of birth, their socioeconomic status, even their education level when they landed in Canada and where they lived when they landed in Canada. So we linked the Ontario Crohn’s and Colitis Cohort to the immigration data to determine the incidence of IBD in immigrants to Canada, separated by region of the world. The focus on South Asian immigrants, published in PLoS ONE last year, showed that really both South Asian immigrants and immigrants from other areas had a lower risk of developing pediatric IBD and adult IBD than non-immigrants, the children of non-immigrants who are
born in Ontario, and that was about half the incidence of the children of non-immigrants. When you looked at the Ontario-born children of immigrants, so their parents arrived in Ontario and then they had children born in Ontario, in fact the South Asian group had the same incidence of IBD as the non-immigrant group. So something about South Asians is triggered when they arrive in Ontario: something environmental is triggering risk genes to give them that high risk of developing pediatric-onset IBD that other Canadians have, whereas in other immigrant groups the environmental risk factors are not triggered, there’s nothing triggering the genes, probably, and they continue to have a low risk of developing IBD.

We also found that for every decade earlier in life at which you arrive in Ontario, you have a ten percent increased risk of developing IBD after arrival. That’s shown here in a different way, with an incidence rate ratio: immigrants who arrived between 0 and 9 years old had about three to three and a half times the risk of developing IBD compared to people who arrived when they were older than 60. So again, something about being exposed to the Canadian environment earlier in your life increases your risk of developing IBD, and the question then is why. Is it some change in the microbiome that happens early in life? Is it hygiene? Is it exposure to antibiotics? Is it delivery method, since C-sections have been implicated in developing IBD? Lack of sunlight exposure? In Canada we’re all pretty much vitamin D deficient unless we’re supplementing. Other environmental risk factors? What’s going on that’s triggering these genes in certain people but not others?

And so again, thinking about this new translational approach, we take a finding that we made at the health system and population level and bring it back to the bench and
the bedside to try to find an answer, and two groups in Canada are now working on it. One is the Canadian Children IBD Network, which is an inception cohort of almost all the children diagnosed with IBD in Canada, across all 10 provinces and 12 centers. And the Gemini project is a Toronto-based project that’s looking at the risk of many autoimmune diseases in young adult South Asians who live in Toronto. They’re looking at environmental risk factors using a modified Phoenix questionnaire, which is a validated questionnaire for environmental risk factors, and looking at diet, whole-exome sequencing, RNA sequencing, metabolomics and the intestinal microbiome in order to try to understand exactly what’s happening in South Asian people that is triggering this disease. So really bringing it back to the patient level and back to the basic science level to try to find an answer for these findings that we made at the population level.

So going on in the Venn diagram, looking at possible environmental triggers that could explain this increased incidence of IBD that we’re seeing in Ontario: this is a systematic review that we published a while ago, but you don’t have to read all of this. This table goes on for three or four pages, and this is just the environmental risk factors associated with pediatric IBD. So there have been many, many studies that have tried to look at environmental risk factors. The key here is that we need to focus. Knowing that the earlier in life you arrive in Canada, the higher your risk of the disease, we need to look at what early-life exposures trigger it, what’s the point of greatest risk, so perhaps we can intervene to prevent it, and what’s the biologic plausibility of this environmental risk factor causing the disease.

Gil Kaplan took this approach when he looked at the THIN database in the UK, which is a primary care database, comparing Crohn’s to matched controls. Interestingly, he found, looking at air pollution levels in the region in which these patients lived, that overall there was no increased risk if you lived in highly polluted areas. However, if you looked at young-onset IBD, there was an increased trend in patients exposed to nitrogen dioxide and particulate matter, so only in pediatric-onset disease, not in adult-onset disease. What he did then was take it back to the bench, collaborating with Karen Madsen in Edmonton, to determine the biologic plausibility of air pollution increasing your risk of developing IBD. What that group did is they fed a mouse model particulate matter, one of the factors that may increase the risk, and found that it increased pro-inflammatory cytokines, decreased microbial diversity and changed the microbiome of these mice. Not only that, a separate study looked at early-life exposure specifically and found that it was really only that early-life exposure that changed the
microbiome and resulted in more colonic inflammation in this mouse model of IBD. So again, stressing that something early in life is causing these diseases. It may not be just air pollution, it may not be air pollution at all, but we need to focus on early life.

In addition, other studies have looked at urban versus rural environment. There have been many studies that have looked at that and found that an urban environment increases your risk of developing IBD overall, particularly Crohn’s disease but also ulcerative colitis. However, there’s a lot of heterogeneity between studies. So we thought as a group, for CanGIEC, what could we add to all of these studies that have looked at rural versus urban environment? We looked at Canada and tested different definitions of rurality in order to see whether they gave different IBD risk, and then we wanted to look really at age of rural living: was there an age-exposure-outcome relationship that increased or decreased your risk of developing IBD?

What we found was very interesting, I think. In the three provinces that participated, firstly, there was a lot of heterogeneity between provinces. There didn’t seem to be a significant difference between rural and urban in Ontario. However, rurality did protect you from developing IBD, Crohn’s more than UC again, but both Crohn’s and UC, in Manitoba and in Alberta. That may be a function of Ontario being very clustered around cities, so even an environment defined as rural may not be all that rural. It might also be the type of farm: in Alberta it’s mostly beef farming, in Manitoba it’s wheat and grain, and in Ontario it can be a mix of wheat and grain and fruit and other things. So there’s heterogeneity amongst provinces that we have to address. But another interesting finding was that the relationship was strongest in pediatric-onset IBD again. So exposure to the rural environment mostly protected you against pediatric-onset IBD, whereas for adult-onset
IBD the relationship wasn’t as strong. So again, early-life exposure.

So the next step really is going to be to look at the microbiome, the exposome, epigenetics and other biologic factors that may lead us to understand these environmental risk factors and how they’re changing our bodies. We’re actually doing a microbiome study now in Ottawa to look at rural versus urban, to see whether there is different microbial diversity, different clustering of different species, in rural versus urban children who develop IBD. The exposome is the idea of tracking everything you do, so wearable technology to basically give tons of data and tell you what environmental risk factors are coming into your body and might be affecting it, and a lot of work is going on here in California on that.

And then I think we’ve so far been restricted to hypothesis-driven research, so you go in with a hypothesis and you test it. That’s partly the fault of people like me. I’m a clinical epidemiologist; we were trained to ask a question, design the study and test the hypothesis. But I think more and more the computer science folks are developing new ways of using big data and data mining, and instead of testing hypotheses it’s going to generate hypotheses, in order to give us new clues about IBD that we wouldn’t otherwise have thought of. We have a PhD student now, as well as a computer scientist, who’s going to be looking at the Ontario data using data mining, basically to look at early-life exposures, early-life risk factors, and the development of multiple autoimmune and immune-based diseases like IBD, asthma, diabetes and MS, hoping that we can generate new hypotheses for risk factors that are associated with these diseases.

So in conclusion, I hope I showed that translational research isn’t just bench to bedside anymore. We really can integrate population-level data and big data in order to answer new research questions, and we really have to look beyond traditional research to take advantage of large linked databases. That includes the health administrative data, but it can also include different types of population-based data that are available. And I think that the future really is that model that I showed you in Edmonton and Calgary, where epidemiologists make findings with large databases and then bring them back to the bench or to the patient level in order to understand the biology behind the findings that they’re seeing at the population level. I’m happy to take any questions if anybody has any.
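
The distributed network approach described in the talk (run the same analysis in each province, share only aggregate results, then meta-analyze them while accounting for heterogeneity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the consortium’s actual analysis: the talk does not say which pooling method was used, so DerSimonian-Laird random-effects pooling of log incidence rates is an assumption here, and the site names, case counts and person-years are made-up placeholders rather than real provincial data.

```python
import math


def log_rate(cases, person_years):
    """Log incidence rate and its approximate variance (1 / cases),
    treating the case count as Poisson."""
    return math.log(cases / person_years), 1.0 / cases


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling of per-site estimates.

    Returns (pooled estimate, standard error, between-site variance tau^2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures how far the sites disagree beyond chance
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-weight each site with the between-site variance added
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical aggregate counts that each site would share (no individual-level
# data leaves the site). These numbers are placeholders for illustration only.
site_summaries = {
    "site_a": (240, 2_500_000),   # (new cases, person-years at risk)
    "site_b": (95, 1_100_000),
    "site_c": (30, 200_000),
}

ys, vs = zip(*(log_rate(c, py) for c, py in site_summaries.values()))
pooled, se, tau2 = pool_random_effects(list(ys), list(vs))

rate = math.exp(pooled) * 100_000
low = math.exp(pooled - 1.96 * se) * 100_000
high = math.exp(pooled + 1.96 * se) * 100_000
print(f"pooled incidence per 100,000: {rate:.2f} (95% CI {low:.2f} to {high:.2f})")
print(f"between-site variance tau^2: {tau2:.4f}")
```

Pooling on the log scale keeps the confidence interval positive after back-transformation, and a `tau2` above zero signals the kind of between-province heterogeneity the talk describes.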
