
Good afternoon, everyone. My name is Roy Ben Alta, I'm a business development manager at AWS, and thank you for coming to our session, "The Life of a Click." What should you expect from this session? First, how many of you are doing clickstream analytics today? Nice. How many are using Hadoop for that? Redshift? Kinesis? OK. And how many of you are not doing it yet, but are looking at it from a business perspective and want to know how? Interesting. So we're going to talk about common patterns for clickstream analytics, give you some tips on how to use EMR and Kinesis for this type of workload on AWS, and for the main course we'll have Rick McFarland, the vice president of data services at Hearst, talk about their journey of clickstream analytics at Hearst.

Clickstream analytics really means business value, regardless of which vertical you work in today: consumer online, e-commerce, banking — you heard Capital One today, moving their digital platform to AWS — or the Internet of Things. We want to identify what our customers are doing on our digital assets. When we say "clickstream," it can be a click on an ad that is aggregated and analyzed later, key performance indicators that we capture, or more advanced use cases like recommendation engines: I'm on the website, I buy something, you want to do an upsell and suggest the next product, or put the right promotional banner in front of the customer. What customers do is analyze these collections of web logs, or clickstreams.

But there are challenges, and the challenges are around the velocity of the data. Some of you are familiar with Omniture web logs, or with your own homegrown web logs. They come in different formats — JSON, TSV, CSV, Avro, and many more — the volume is massive, and you need to analyze it. I always like to give this example: how many of you have written MapReduce in your life? Good. So you remember this one — hello world, your first Hadoop program, doing word count. You have a nice gigabyte of text files, you write your program, and now you know how many words are in your large object. That's really just to understand how MapReduce works through the sort and shuffle. But the real first project for most of the customers I've worked with starts here: a collection of web logs, and really messy web logs. You have cookies, you have query strings, you need IP and geo lookups, you need to clean that data, and at the end of the day many customers produce little more than a nice visualization or aggregation. That is the basic pattern I've seen over the last six or seven years, and I'm going to share with you how it evolved. (A rough sketch of this kind of log cleanup follows below.)

The first pattern is very common: use Flume or Kafka, land the data on HDFS, create Hive tables over terabytes of data, run a query from your visualization tool or the command line, and go make yourself a coffee — the query runs for a long, long time. This is how it started. I see people smiling; you're familiar with that. The next phase was: OK, it's quite slow, so let's aggregate it. We'll use Pig, run some ETL, build the aggregated data set, and push it back to our data warehouse. We have customers loading this into Redshift, and I've seen customers loading it into Vertica. They get low latency when they retrieve that aggregated data, and they really offload the heavy ETL to Hadoop.
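As a hypothetical illustration of the kind of web-log cleanup described above (query-string stripping, bot filtering), here is a minimal Python sketch; the field layout and helper names are assumptions, not actual AWS or customer code:

```python
import re
from urllib.parse import urlsplit

BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)

def clean_log_line(line):
    """Parse one tab-separated web-log record into a tidy dict (hypothetical layout)."""
    # Assumed columns: timestamp, ip, url, referrer, user_agent, cookie
    ts, ip, url, referrer, user_agent, _cookie = line.rstrip("\n").split("\t")
    if BOT_PATTERN.search(user_agent):
        return None                      # drop crawlers and bots
    parts = urlsplit(url)
    return {
        "timestamp": ts,
        "ip": ip,
        "path": parts.path,              # keep the page, drop the query string
        "referrer": referrer or None,
        "user_agent": user_agent,
    }

# Example:
# clean_log_line("2015-10-08T12:00:00\t10.0.0.1\t/article/42?utm=x\t-\tMozilla/5.0\tsession=abc")
```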

The next generation really uses HDFS as the hub. Customers say: OK, I have web logs, but I also have reference tables — I'll use Sqoop to push those into HDFS too, and I don't really need a separate ETL step. There are some really nice engines for this: Cloudera came out with Impala, and there's Spark SQL and Presto. At the Strata conference in New York last week I think there were over five companies offering a SQL engine on Hadoop, and each one of them said they have the best performance, which is funny — in a nutshell, we are rewriting databases. So you have Tez and other improvements for querying HDFS.

How does it look on AWS? This is a very common pattern that we see customers running today. You collect the web logs — usually an agent sits on a fleet of web servers — you write to a Kinesis stream, use a KCL application to dump the data into S3, and then run the cleaning on EMR, using it as an ETL grid. Now you have a really nice data set and you can send it back to S3, because, for example, if you are an advertising company, the output is what you send to another agency, or you have customers you need to push feeds to. Today we see a lot of integration between companies where, instead of FTP, they use a bucket policy and simply grant access to the output bucket holding the aggregated data set. Or, if you have a data warehouse and BI needs, you load the aggregated data set into Redshift for analysis.

But that's not enough. Why is it not enough? Because, as I said at the beginning, the clickstream has value and the business wants the answers now. Traditionally, many customers moved from analyzing web logs once a week, to once a day, to maybe twice a day, but today you want to analyze them continuously. You want to use the clickstream to create actions now and give the business visibility into what users are doing. If you're in advertising, how do you optimize impressions in real time? These are the kinds of algorithms you want to run in near real time — we say "real time," but it's really near real time, down to the minute or second. From the moment the customer clicks or runs a transaction on your digital assets, you want to be able to process the data and understand the value in it. So you do micro-batch aggregation, or build time-windowed series. This is where we see the industry moving: from batch toward near real time. There is a hybrid approach — you still have massive data sets that you load once a month — but in the context of clickstreams and web logs, the trend is clearly toward near real time.

So, to recap how you do that: Amazon Kinesis Streams, as you heard in today's keynote, is a managed service for streaming ingestion that gives you continuous processing on your streams. What's unique about Kinesis is that you can have multiple applications processing the same clickstream data — or any data, since a record is just a blob. You can have one application loading it to S3 as your data lake, one application loading it to Redshift, one application processing it on EMR or writing to DynamoDB, and custom applications that customers build using the Kinesis Client Library.
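As a hedged sketch of the producer side — an agent on a web server pushing log records onto a stream — here is what a minimal boto3 put_record call might look like; the stream name and record layout are assumptions:

```python
import json
import time
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

def send_click(event):
    """Push one clickstream event (a dict) onto a hypothetical 'clickstream' stream."""
    event["server_ts"] = int(time.time() * 1000)        # stamp with server time
    kinesis.put_record(
        StreamName="clickstream",                       # assumed stream name
        Data=json.dumps(event).encode("utf-8"),         # records are blobs; JSON keeps them flexible
        PartitionKey=event.get("session_id", "anon"),   # spreads load across shards
    )

send_click({"session_id": "abc123", "path": "/article/42", "referrer": "google"})
```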
Another major component is Amazon Elastic MapReduce, Amazon EMR. It's a managed service, introduced back in 2010, that gives you Hadoop as a service. We made a major change to the product — we re-architected it to use Apache Bigtop as the engine — and Apache Spark is now a first-class citizen within EMR. A quick recap on Apache Spark: it's a distributed, in-memory engine, written in Scala, with Spark Core, Spark SQL, and Spark Streaming, which today lets customers do things with much better latency than they got with Hive or the MapReduce paradigm on Hadoop. And it really changed the economics: with MapReduce you're heavily I/O-bound on the ephemeral storage of your instances, while a Spark job runs in memory, so you do much less I/O and are more compute- and memory-bound. That is where the cost benefit of processing clickstreams comes into the picture.

How many of you use Spot Instances with EMR? How many are not using them at all? OK, good. Spot Instances let you bid for compute hours, and you can use them with EMR. Remember, EMR has the concept of master, core, and task nodes. The difference is that a core node has ephemeral storage backing HDFS — if that node goes down, you lose data — while a task node stores nothing; it just adds horsepower to your processing. Let's do the math. Say you have a 10-node cluster that runs for 14 hours at $1 per node per hour: for 14 hours you pay $140. Now add 10 more task nodes running on Spot, bid at 50 cents, and the same job finishes in 7 hours on the 20-node cluster: you pay $70 for the on-demand nodes and $35 for the nodes running on Spot Instances — $105 total.
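That back-of-the-envelope comparison can be written out explicitly; a tiny sketch using the numbers from the talk (the $1 on-demand and $0.50 Spot figures are the speaker's round numbers, not actual pricing):

```python
# 10 on-demand nodes, 14 hours, $1 per node-hour
on_demand_only = 10 * 14 * 1.00                      # $140

# Add 10 Spot task nodes at a $0.50 bid; the job now finishes in 7 hours
with_spot = 10 * 7 * 1.00 + 10 * 7 * 0.50            # $70 + $35 = $105

savings = 1 - with_spot / on_demand_only
print(on_demand_only, with_spot, f"{savings:.0%}")   # 140.0 105.0 25%
```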

That's really rough math — we have customers that save over 70% by using Spot Instances for processing. Less time, less cost with Spot, especially if you have Spark jobs. We've also found that many customers use Hadoop purely as an ETL grid: they clean the data and load it somewhere else. Why do you need a persistent Hadoop cluster for that type of activity, when you can use S3 as your HDFS — as your data lake — and have multiple EMR clusters processing the data, so you can focus on your insights?

So here is how it looks today, across batch, near real time, and interactive processing: you use Kinesis, you use Spark Streaming with Amazon Kinesis — I'll give you some tips on how nicely those two work together — and you can run multiple EMR clusters, for example for data scientists. This is very common: customers have a data science team and many analysts who like to use R and different algorithms, and they want access to all the data, because you want a data-driven company. So in the life of a data scientist: they come in in the morning, create their own cluster using Spot Instances, and when they're done with their work they terminate the cluster and go home. You pay for what you use, and you don't need to buy a permanent cluster and then run into problems with multiple users sharing one cluster, hitting race conditions, and so on. (A rough boto3 sketch of spinning up such a transient cluster follows below.) It's very common to install a notebook on it — Jupyter, IPython. Additionally, if you have data warehouse or BI needs, you can load the aggregated data set into Redshift and use a visualization tool: Tableau, whatever your preference, or Amazon QuickSight, which we just announced today.
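Here is a hedged boto3 sketch of launching that kind of transient EMR cluster with Spot task nodes and auto-termination; the names, instance types, and bid price are placeholders, and the IAM roles assume the EMR defaults exist in the account:

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="clickstream-etl",                          # hypothetical cluster name
    ReleaseLabel="emr-4.1.0",                        # release with Spark 1.5 built in
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m3.xlarge", "InstanceCount": 1, "Market": "ON_DEMAND"},
            {"Name": "core", "InstanceRole": "CORE",
             "InstanceType": "m3.xlarge", "InstanceCount": 10, "Market": "ON_DEMAND"},
            {"Name": "task-spot", "InstanceRole": "TASK",
             "InstanceType": "m3.xlarge", "InstanceCount": 10,
             "Market": "SPOT", "BidPrice": "0.50"},   # Spot task nodes: no HDFS, pure horsepower
        ],
        "KeepJobFlowAliveWhenNoSteps": False,         # terminate when the steps finish
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```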
Now, the trick with pixels — this is quite interesting. Say you have a web page and you want to track some of the transactions on that page. One of the tricks we see many advertising companies use — and I see our customers copying the same method — is the tracking pixel, an HTTP GET. How many of you are familiar with this trick? OK, we have quite a few ad-tech people here. What you do is put a small piece of JavaScript on your page that requests a tiny pixel from the server — a 1x1 image, just a few bytes — and you concatenate all the information you want to capture about the user, from which browser they're using to which transaction they performed, into that HTTP GET request. The server sends back the pixel as the response, and you've captured the clickstream data.
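As a hedged illustration of such a collector endpoint — the kind of Kinesis proxy described next — here is a minimal Python/Flask sketch that returns a transparent 1x1 GIF and forwards the query-string payload to a stream; the stream name, route, and field names are assumptions:

```python
import base64
import json
import time

import boto3
from flask import Flask, Response, request

app = Flask(__name__)
kinesis = boto3.client("kinesis", region_name="us-east-1")

# Standard 1x1 transparent GIF
PIXEL = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

@app.route("/pixel.gif")
def pixel():
    event = dict(request.args)                        # whatever the page tag appended to the GET
    event["server_ts"] = int(time.time() * 1000)      # server timestamp; client clocks are unreliable
    event["user_agent"] = request.headers.get("User-Agent", "")
    kinesis.put_record(
        StreamName="clickstream",                     # assumed stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event.get("session_id", str(event["server_ts"])),
    )
    return Response(PIXEL, mimetype="image/gif")

if __name__ == "__main__":
    app.run()
```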

So how do you do it with Kinesis? Customers use Elastic Beanstalk to build an application that acts as a Kinesis proxy. That lets you filter — if someone is crawling you, or you have bots, you can filter those out — and it centralizes your clickstream ingestion so everything runs through Kinesis. That's one trick, because what you push is just a data record, a blob, and you can convert it to JSON or whatever format you'd like to work with.

I promised you some tips. For Amazon Kinesis applications, there are artifacts published for the Kinesis Client Library, so you can build your application with Maven; or you can use Spark Streaming with Kinesis, which makes it very easy to process the data on your stream in Scala or PySpark. On shards: if you're not using Amazon Kinesis Firehose, which we announced today, we always recommend keeping headroom for catching up on your stream. Right now data in a Kinesis stream is stored for up to 24 hours, so provision extra shards in case you need to work through a backlog. And when you use the KCL, it uses DynamoDB for checkpointing, so give your application a unique name — it's one-to-one; the application name becomes the name of the DynamoDB table — and of course mind the provisioned throughput. The table is created automatically; you don't need to create or maintain it manually.

When using Spark on EMR: as I said, Apache Spark is a first-class citizen on Amazon EMR, and starting with the 4.x releases you no longer need a bootstrap action to install Spark — it comes native. With last week's release, EMR 4.1.0, we have Spark 1.5, the latest release. Use YARN cluster mode if you're running on multiple nodes, and a few other tips, like using the Kryo serializer. This is a 300-level session, so I didn't stop to explain each of these — we assume you know them.
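As a hedged sketch of that Spark Streaming plus Kinesis combination — shown here in PySpark against Spark 1.5; it assumes the spark-streaming-kinesis-asl package is on the classpath, and the stream and application names are placeholders:

```python
import json

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import InitialPositionInStream, KinesisUtils

sc = SparkContext(appName="clickstream-streaming")
ssc = StreamingContext(sc, batchDuration=60)        # 60-second micro-batches

lines = KinesisUtils.createStream(
    ssc,
    kinesisAppName="clickstream-spark",             # becomes the DynamoDB checkpoint table name
    streamName="clickstream",                       # assumed stream name
    endpointUrl="https://kinesis.us-east-1.amazonaws.com",
    regionName="us-east-1",
    initialPositionInStream=InitialPositionInStream.LATEST,
    checkpointInterval=60,
)

# Count page views per path in each micro-batch
events = lines.map(json.loads)
pageviews = events.map(lambda e: (e.get("path"), 1)).reduceByKey(lambda a, b: a + b)
pageviews.pprint()

ssc.start()
ssc.awaitTermination()
```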
With that, I'd like to invite Rick McFarland, vice president of data services at Hearst, to talk about their experience with clickstream analytics and the life of a click at Hearst.

Thank you very much, Roy — and thanks, Amazon — thanks for joining me today and for the opportunity to share some of the things Hearst is doing with AWS. In the theme Roy started, let me begin by asking: how many of you have ever heard of Hearst? How about Hearst the company? Lost some hands there. How about Hearst before you saw this session on the agenda? Good. I'll start with a little background on Hearst, because it gives nice context for how we developed our clickstream. Hearst is a collection of about 200 companies in over a hundred countries. We are a magazine publisher — you may recognize some of these brands; we have 20 US titles and over 300 international editions. We're a television broadcaster, with 31 stations across the country — you may recognize a few of those. We're a newspaper publisher — that's how we started 128 years ago, with some pretty cool brands like the San Francisco Chronicle. And, interestingly, we're also a business-to-business data provider: we provide businesses with data in the medical space, the automotive space, and the finance space.

Let's see if this works. A lot of people refer to Hearst as a publisher, but — and I may be a little biased because I run the data team — I actually think of Hearst as a data creation company. Our data creators create the data that fuels the apps and tools you use every day.

They give you the information you need to plan your day — the weather, the school closings. Our data creators give you the news; they also give you culture, some fashion advice, and everything you ever wanted to know about Kim Kardashian as well.

So Hearst is a collection of companies, and each of these companies operates independently and generates terabytes of data every single day. We've actually measured it: we collect about a petabyte of data a year from all of our different sites. Three years ago Hearst established a central organization called the Data Services team, and we have a very simple mission: to ensure that Hearst, as a corporation, leverages all of this combined data that it generates. Our challenge — and it still is the challenge — is unifying all of the data streams that our individual companies create. We also wanted to develop a platform for analytics and product creation for the rest of the company to use, and we did this using AWS resources. I'm going to share with you today the data pipeline we built in a fairly short amount of time, along with the experiences we had along the way. Hopefully, if you're building one yourself, you'll pick up some tips and tricks and maybe do it quicker than we did — but sometimes it's the journey that's important, right?

Before I get into the pipeline and all the details of the AWS architecture: a lot of the pipelines we design are motivated by the products at the end of the pipeline that use the data. So I want to show you one of the major motivators of our pipeline, a product used by our editors called Buzzing@Hearst. It's on display at our booth, 1156 in Hall C, if you're curious and want to play with it. Imagine a Yahoo for editors — but it has to be instantaneous and it has to be predictive. We've been using this tool for the last six months and the results have been very interesting and quite impressive: we're increasing our page views by twenty-five percent, because the editors use the tool to take a piece of content that's trending on one of their sites and circulate and syndicate it across all the other sites. We're leveraging the power of our 350 sites to circulate articles and get incremental click rates — incremental page views — of twenty-five to almost thirty percent, which, if you're a revenue person, translates to a nice increase in revenue. I can't tell you how much, but it's a very valuable tool.

It also has some very distinct requirements that were brought back to the data team, and we broke the developers' requirements down into engineering requirements — these are the ones that actually dictate the structure of our data pipeline, and I've highlighted the main ones. Numbers one and two are the throughput and latency goals. I always think of throughput and latency as how much money you can make and how fast you can get that first dollar. The throughput goal is that we have to be able to get all 350 websites to funnel their data through this pipeline; that's very important, because otherwise you're not leveraging your whole capability.
We have a latency goal: we need to get from click to the tool in five minutes — and that requirement is actually down to two minutes now. That's how fast you can get a dollar. It needs to be agile: the pipeline has to be easy and quick to change when we add variables or change things along the way. As you know, some data pipelines are very brittle — you get what you get and you'll like it — but this one has to let you add new elements at the beginning of the pipeline and have them flow quickly all the way through to the end.

We also have some unique metric requirements. The data science team is focused not on lagging metrics — reporting page views or clicks that happened yesterday or an hour ago — but on building models and functions that are predictive, that say, "I think this article will be trending an hour or two from now." Leading metrics that predict the future. We have data reporting windows: the editors want to see what happened in the last hour, the last day, the last week, and the last month. So it's not a pure real-time tool, as some of you may be thinking; it has to be able to look across a whole set of fairly big windows of time. Our front-end developers are building this thing from scratch — you can't just buy this kind of site out of the box — so the end of the pipeline, which is an API, has to be very flexible about what the endpoints look like, so it can feed the requirements of the front-end team. They want their front ends to be as fast as possible, and their requirement on the API is basically "give me the data exactly this way."

And here is the biggest, hardest problem: we have 350 websites operating individually, and the whole data pipeline has to be implemented without affecting their day-to-day operations. That's actually kind of hard to do when you can't shut a website down or tell them to add things.

So let's look at the assets I had to work with when we started this project three years ago. We had a very static clickstream collection process in place. What I mean by static is that we had the ability to collect clickstream data on some of our websites, but it happened as a daily batch that updated and got loaded into a Netezza data warehouse, which we could query very nicely — Netezza is great for that. The problem was that we could only do ad hoc queries; it wasn't productionizable, and we were only pushing about 30 gigabytes of data a day through it. The other very important asset we had — and this will be one of my first tips — is that we were able to implement a tag management system on all of our websites. For those not familiar, a tag management system is basically a container, a piece of JavaScript that goes on the web page, where you put all of your tags — that's what it's traditionally known as. But what a tag management system really is, to me, is a code distribution mechanism. By putting this tag manager on every one of our pages, it centralizes the ability to distribute tags — JavaScript — to all the web pages instantaneously. I'll come back to that in a second.

Before I show you the full data pipeline, I'm going to take it apart, because it didn't get created in one day; it rolled out over time. The first part of any data pipeline is ingest — you have to be able to collect your data. The way I'll do the rest of the presentation is: I'll show you a piece of the pipeline, and then, for those of you who want to take something away, a few code snippets with some tips. The beginning of our pipeline was ingest.
As Roy alluded to earlier with the HTTP request, we first had to implement JavaScript on all the sites to start transmitting data into our clickstream, and we had to do it without interrupting their day-to-day operations. Because we had a tag manager on all of our websites, I can hand code to one single person centrally, and that person can distribute it without necessarily asking permission — we usually do ask — so you can put code on the page without the websites having to do anything. So the first tip: if you have a very big network of sites, definitely implement a tag management solution.

The vendor we used is Ensighten, and they've been a good partner. We then implemented Elastic Beanstalk with Node.js, which exposes the HTTP endpoint Roy alluded to earlier in his presentation. So we take the data we're interested in off a page, tack it onto an HTTP request, transmit it to the Node.js application, and that application puts it onto a Kinesis stream. This was before the announcement of Firehose, so we also had a KCL application that pushed the data into an S3 bucket. Basically I'm trying to create a data lake in S3 of all of our clickstream data; my first job is to get every action on every one of our websites into S3. (A simplified consumer sketch follows below.)

Here's a Node.js snippet — I know it's hard to read, but the presentation will be available later. Some key tips, and I'll have to speed up: pick a good partition key so you evenly distribute your data across all the shards in your Kinesis stream; make sure you use an asynchronous series call so you don't interrupt the page load; and we also used Node.js to put a server timestamp into the stream, because the timestamp that comes in from the web page is unreliable. Finally, one of the most important things — it's optional, but I highly recommend it — is to use JSON formatting when you push records onto the stream. This gives you maximum flexibility down the road as you add new elements at the front of your pipeline, which you inevitably will; everybody is going to want to add more variables, and with JSON you can easily expand that variable list without disrupting your pipeline later on.

The other great thing is the monitoring you get with AWS: we can monitor each of our Elastic Beanstalk environments and our Kinesis streams, and they have nice monitoring features. For example, we have an auto-scaling trigger: when more than 20 megabytes of data is coming in, it triggers an auto-scale and grows the fleet so we can handle the load. So you get really good monitoring with these tools to keep an eye on spikes.

To summarize phase one: use JSON; we used HTTP GET requests to introduce the minimal amount of code on the page; we liked Elastic Beanstalk and Kinesis because they're auto-scalable, so as we ramped up our sites and got them all on board, they would scale up to handle the load; and S3 was a great place to put all the data, because it's reliable and, of course, all the later processes rely on S3.
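As a hedged sketch of that "consume the stream and land it in S3" step — the real pipeline used the KCL, so this simplified single-shard boto3 loop is only illustrative, and the stream and bucket names are assumptions:

```python
import time
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
s3 = boto3.client("s3")

STREAM, BUCKET = "clickstream", "clickstream-raw"    # hypothetical names

shard_id = kinesis.describe_stream(StreamName=STREAM)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=STREAM, ShardId=shard_id, ShardIteratorType="LATEST"
)["ShardIterator"]

batch = []
while True:
    out = kinesis.get_records(ShardIterator=iterator, Limit=500)
    iterator = out["NextShardIterator"]
    batch.extend(record["Data"] for record in out["Records"])
    if len(batch) >= 1000:                           # flush in chunks to S3
        key = "raw/%d.json" % int(time.time())
        s3.put_object(Bucket=BUCKET, Key=key, Body=b"\n".join(batch))
        batch = []
    time.sleep(1)
```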
Now that ingest is out of the way — and you need a solid, confident ingest in your pipeline — you can focus on ETL, the next phase. The data coming in from our websites is not clean, and I'm sure if you have a data pipeline you know that whatever is put on the page gets scraped and sent back, and it can have a lot of mistakes in it. So ETL has to happen. For our first phase of ETL we decided to use EMR with Hadoop, and the reason is that we knew Pig — I think that's often how things evolve: you use what your team can do — so we could code in Pig Latin. We created a Hadoop cluster that would grab the data we were putting onto the stream from S3, clean it up, and put it back into another S3 bucket as the cleaned and slightly aggregated data. If any of you program in Pig, you know you need a lot of UDFs to work with it, because it doesn't have many useful functions out of the box. We ended up writing 50 UDFs in Python, because that's our coding language of choice — we could have done it in Java, but we like Python, and that matters.

Here's a snippet of our Pig script, with some tips if you do any coding in Pig. I definitely recommend gzipping your files — we use the first two lines here to get compression. The next section down is an example of how to call the UDFs. And regexes scare people in a lot of languages, but the regexes in Pig aren't so bad.
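As a hedged example of what one of those Python UDFs for Pig might look like (hypothetical functions, registered from Pig with something like `REGISTER 'clean_udfs.py' USING jython AS clean;`):

```python
# clean_udfs.py -- hypothetical Jython UDFs for Pig
import re
from pig_util import outputSchema

BOT_RE = re.compile(r"bot|crawler|spider", re.IGNORECASE)

@outputSchema("is_bot:int")
def is_bot(user_agent):
    """Flag records whose user agent looks like a crawler."""
    if user_agent is None:
        return 1
    return 1 if BOT_RE.search(user_agent) else 0

@outputSchema("path:chararray")
def strip_query(url):
    """'/article/123?utm_source=x' -> '/article/123'"""
    if not url:
        return None
    return url.split("?", 1)[0]
```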

Version 2 of our ETL: along came Spark. One of the neat things about Spark is that it allowed us to stream the data directly from Kinesis, remove the write to S3, and process it in near real time, in memory. The drawback with Spark Streaming is Scala — a language not a lot of people know. I'm here as a convert, because I'm now convinced Scala is not so bad; if you know Python, I think Scala will not be that bad either. And the way we got around it is a very important little trick: we embedded SQL. We took our existing Pig scripts, converted them to SQL, and embedded that SQL inside Scala — here's an example of a little SQL function that gets called from the Scala script. Basically we used Scala as a wrapper and did our ETL in SQL, and that becomes relevant later on; SQL is a theme for me. Also, don't be intimidated — this slide shows a Python UDF and a Scala UDF side by side, and the argument is that they're not that different: here are some commands in Scala and the same commands in Python. So I'm trying, speedily, to convince you that if you know a little Python, Scala is doable.
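Hearst's version embedded the SQL in Scala; as a hedged sketch of the same "SQL inside Spark" idea, here it is in PySpark (Spark 1.x-style SQLContext, with made-up bucket, table, and column names):

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="clickstream-etl-sql")
sqlContext = SQLContext(sc)

# Hypothetical input: newline-delimited JSON click events landed in S3
events = sqlContext.read.json("s3://clickstream-raw/2015/10/08/*")
events.registerTempTable("clicks")

# The former Pig logic expressed as SQL and wrapped by the driver program
cleaned = sqlContext.sql("""
    SELECT path,
           user_agent,
           COUNT(*) AS pageviews
    FROM clicks
    WHERE user_agent NOT LIKE '%bot%'
    GROUP BY path, user_agent
""")

cleaned.write.mode("overwrite").json("s3://clickstream-clean/2015/10/08/")
```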
Once the ETL is done and the data is cleaned up and munged down a bit, we have to come up with those funky predictive, model-based variables — that's the data science section, phase 3 of the pipeline. We started our data science as an investigative process: we used SAS on a single EC2 node, pulled the data from S3 into SAS, and ran our models there. Why SAS? Because our data science folks are good at SAS — most people in data science know it, and it's great for guerrilla work and figuring out your models. The problem was that a single node of SAS pulling the data from S3, processing it, and pushing it back takes three to five minutes, which breaks our two-minute goal.

A couple of SAS code examples — I'll speed through them — show that we embedded SQL here too. A tip if you're going to do things in SAS: I know there's a PROC SQL camp and a non-PROC SQL camp, but I definitely recommend doing your work in PROC SQL, because, again, SQL is my theme through the pipeline. With all these newfangled products coming out, if you try to interchange them, the one nice commonality is that you can usually take your SQL and re-embed it inside the new product. Because we did it in PROC SQL, it was very easy to get to the next step, which I call the production version of our models. Once we've figured out the models in SAS, we store the model coefficients — the model parameters — in S3. And since the team wrote most of the logic in PROC SQL, here comes Redshift: Redshift can load data from S3 and it speaks PostgreSQL, so we took the PROC SQL, put it into Postgres, grabbed the models, and converted everything in our data science phase to use Redshift — and suddenly our data science processing time is a hundred seconds. Pretty impressive, I think. I can't say enough good things about Redshift: if you have something you think is really fast and can't be beaten, I always recommend trying it in Redshift and seeing how fast you can get it.
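As a hedged sketch of that production step — loading model coefficients from S3 into Redshift and scoring with plain SQL — here is what it might look like from Python with psycopg2; the connection details, table names, and the linear-scoring formula are all assumptions:

```python
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder endpoint
    port=5439, dbname="analytics", user="etl", password="...",
)
cur = conn.cursor()

# Load the latest model coefficients that the data-science step wrote to S3
cur.execute("""
    COPY model_coefficients
    FROM 's3://clickstream-models/latest/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    FORMAT AS JSON 'auto';
""")

# Score recent articles with a simple (made-up) linear model, entirely in SQL
cur.execute("""
    SELECT a.article_id,
           SUM(f.value * c.weight) AS trend_score
    FROM article_features f
    JOIN model_coefficients c USING (feature_name)
    JOIN articles a USING (article_id)
    GROUP BY a.article_id
    ORDER BY trend_score DESC
    LIMIT 100;
""")
rows = cur.fetchall()
conn.commit()
```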

Redshift keeps shocking me with how fast it is — it's beaten some of our processes in PySpark and some of our processes in Scala. It's really great at loading data and processing it, as long as you can get it all into SQL, and now with user-defined functions it opens up a whole new set of possibilities. I'll skip my Redshift code example because I'm short on time.

Finally, once the data science part has produced these nice, neat files, we need to expose them through an API endpoint, so the last step is to push the data into an Elasticsearch cluster for indexing. Elasticsearch is great for exposing data through an API endpoint, and it's great for searching. The way we did it at first: since we had an EMR cluster up and running, we wrote a little Pig script to push the data from S3 through EMR into Elasticsearch — people might not know this, but there's a really great Elasticsearch jar for Pig that will write data directly into Elasticsearch and preserve the formatting of your variables, so your datetimes don't change. If you've had challenges loading data into Elasticsearch, you know formatting can be painful. This slide shows an example of that jar — I believe it's available through Amazon as well — and it's really simple: you define your Elasticsearch cluster, read the data into HDFS, and push it into Elasticsearch with all the formatting preserved. But that final step was slow, because spinning up an EMR cluster, reading the data into HDFS, and pushing it back out is obviously slow. So we decided: why don't we push it directly from Redshift? Redshift was already being driven from an EC2 node, so we just push the data from that EC2 node straight into Elasticsearch, and this is the script we used, with a couple of tips on converting your JSON to the right format, which was a pain for us to figure out. (A rough sketch of that direct push follows below.)

So this is our final data pipeline, with all of its latency and throughput. We're ingesting about a hundred gigabytes of data a day; it's processed in milliseconds with Spark; we do our ETL in 30 seconds; our data science runs on Redshift in a hundred seconds; and the results are pushed out to Elasticsearch behind an API. The whole throughput is about a hundred and forty seconds from click to an endpoint. Here's a more visual representation of what's happening: the bulldozer shoves the giant rock, Kinesis and Spark smash it into little pieces, and the data science bulldozer scoops it up, processes it, and creates a diamond at the end.

To summarize the lessons learned — it took us a while to figure these out with our v1: we had a lot of storage steps with S3, and if you can remove storage points, you speed up processing; speed up the processing within each stage; and finally, combine stages, which hopefully, down the road, will make us even faster.
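As a hedged sketch of that direct Redshift-to-Elasticsearch push (using the elasticsearch Python client's bulk helper; the query, index name, and connection details are assumptions):

```python
import psycopg2
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["http://search-node.example.com:9200"])   # placeholder cluster

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="etl", password="...",
)
cur = conn.cursor()
cur.execute("SELECT article_id, title, trend_score, scored_at FROM article_scores;")
columns = [c[0] for c in cur.description]

def actions():
    # One bulk action per row; ISO-8601 strings avoid datetime mapping surprises
    for row in cur:
        doc = dict(zip(columns, row))
        doc["scored_at"] = doc["scored_at"].isoformat()
        yield {"_index": "buzzing", "_id": doc["article_id"], "_source": doc}

bulk(es, actions())
```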
I think we should open it up — we'll also be outside after the presentation for more questions, but if you want to stay, we'll take some questions now if you have any. We won't have time to go deep on the data science here, so come meet us at the Hearst booth — booth 1156 in Hall C — where we can talk about all of this, and you can play with the demo tool that I didn't really get to show you much of.

To conclude from the AWS side: when we worked with Hearst on their clickstreams, we provided the technology for this type of journey. That's why we call the session the life of the click — it's really the journey of an enterprise organization as it starts processing vast amounts of data, and you don't need an army of people to do it, because most of the services are fully managed.

So use that creativity in the pipeline — and today we also announced Amazon QuickSight, so now you have it end to end, from data flow through visualization. That's booth 1156 if you want more information, and Hearst is hiring as well. Sorry the demo didn't work out — we will fix it — and I like that the presentation has the quick tips, so you can print or download it later and use it in your day-to-day work. Thank you very much, everyone. Thank you.
