00:00:02.669 --> 00:00:15.269
Michelle :: Webinar Producer: Hello everyone, and welcome to today's webinar, titled Practical Trends in Big Data and Analytics. Our speaker today is Myles Brown, senior cloud and DevOps advisor at ExitCertified, Tech Data.
00:00:16.049 --> 00:00:28.350
Michelle :: Webinar Producer: With over 20 years of experience in the IT industry across a variety of platforms, and recognized as an AWS Authorized Instructor Champion and a Google Cloud Platform Professional Architect and instructor,
00:00:28.800 --> 00:00:34.830
Michelle :: Webinar Producer: Myles has delivered award-winning authorized IT training for the biggest cloud providers.
00:00:35.790 --> 00:00:48.450
Michelle :: Webinar Producer: Before we get started with the webinar, let's cover the functionalities. During the webinar, everyone's microphones will be muted, so if you have any questions, please find the Q&A box at the bottom of your screen.
00:00:49.050 --> 00:00:59.580
Michelle :: Webinar Producer: Enter your questions in that Q&A box at any time. We'll try to get to them during the webinar, and if not, we'll have a dedicated question and answer session at the end of the lecture.
00:01:00.540 --> 00:01:13.800
Michelle :: Webinar Producer: Today's webinar is being recorded, and we'll send a copy out to each of you by the end of the week. I'm also going to share a promo at the end of the webinar, so stick around to learn a little bit more about that. Alright, Myles, you can take it from here.
00:01:14.550 --> 00:01:16.200
Myles Brown: Okay, thanks, Michelle.
00:01:17.250 --> 00:01:26.700
Myles Brown: So, everybody, my name is Myles Brown. I work at ExitCertified, which is part of Tech Data, and what we do is authorized training. And so we're partnered with
00:01:28.590 --> 00:01:38.430
Myles Brown: over 25 major vendors. You know, the big cloud vendors, IBM, Oracle, VMware, Cloudera, Databricks. You know, a lot of different vendors.
00:01:38.940 --> 00:01:53.910
Myles Brown: Primarily lately I've been more of a cloud guy, but I really was more of a big data guy before that. And so really you know on the cloud side I'm dealing with data and analytics a lot and I still keep
00:01:55.680 --> 00:02:08.760
Myles Brown: keep watch over the big data classes, the Cloudera classes, the Databricks classes, and I'm in constant contact with our instructors there, and I've interviewed a few of them to get some insights for this presentation.
00:02:09.300 --> 00:02:18.510
Myles Brown: So this presentation is going to be about practical trends that we see in big data and analytics. So we're going to start with just a few definitions, make sure we sort of level set. Everybody knows what we're talking about.
00:02:19.320 --> 00:02:24.690
Myles Brown: Then we'll talk about forward-looking trends, and one of the things I find is that if you just go and Google, you know,
00:02:26.010 --> 00:02:36.330
Myles Brown: trends in analytics, you know, 2020, what you really get are, you know, five-years-out trends. So, hey, by 2025, 80% of companies will have this or that.
00:02:36.660 --> 00:02:49.410
Myles Brown: And it's, you know, always really forward-looking, right? And so what we want to concentrate on more is the trends that we actually see happening. What shift have we seen in our students in our big data type classes?
00:02:50.430 --> 00:03:04.980
Myles Brown: And then that shift from just looking at big data to really getting into machine learning and AI. And then we'll end off just sort of talking about where to learn more, and like Michelle said, we'll have a good Q&A at the end. So, a few definitions to start.
00:03:06.000 --> 00:03:13.320
Myles Brown: Just the idea of analytics: it's sort of, you know, the science of analyzing raw data to make some conclusions about that information.
00:03:13.800 --> 00:03:25.290
Myles Brown: And this isn't new. We've been doing this, you know, since the dawn of humanity, probably. But what's really changed is how we do this. And one of the things that's really changed over the last
00:03:26.490 --> 00:03:40.170
Myles Brown: 15 years, 20 years, is that we now have big data, right? That's these extremely large data sets that we analyze computationally to reveal patterns and trends and associations. And
00:03:41.070 --> 00:03:51.090
Myles Brown: I'd say 20 years ago there were only a few companies who truly had big data. That's companies like Google and people like that, who said, hey, we're the only ones that have these problems. But nowadays
00:03:51.660 --> 00:04:06.000
Myles Brown: everybody has big data. You know, if you have a website that's fairly popular, just the clickstream data of, you know, who's clicking where and when, you've got a lot of data that's hard to deal with in regular, traditional sort of relational databases.
00:04:07.050 --> 00:04:15.930
Myles Brown: And another big thing that's come along is this concept of data science, where you sort of have to have some domain expertise. So you have to know something about the
00:04:16.290 --> 00:04:30.210
Myles Brown: The industry or the problem at hand, you have to have some programming skills and some pretty good knowledge of math and stats to be able to, you know, utilize machine learning and apply it on this big data and come up with these conclusions.
00:04:31.710 --> 00:04:39.630
Myles Brown: And so that's sort of where data science intersects with these concepts of AI and machine learning and deep learning and where big data sort of sits in there.
00:04:40.560 --> 00:04:46.350
Myles Brown: When we talk about AI versus ML versus deep learning, you know, AI is that idea that
00:04:46.890 --> 00:04:55.770
Myles Brown: computer systems are able to perform tasks that normally require human intelligence. Machine learning is a subset of that where, you know, the
00:04:56.070 --> 00:05:00.840
Myles Brown: systems themselves that we build automatically learn and improve from experience.
00:05:01.500 --> 00:05:09.600
Myles Brown: And one of the things that was largely theoretical until just a few years ago is the idea of deep learning, which is, you know, a special kind of class of
00:05:09.900 --> 00:05:18.870
Myles Brown: algorithms where what we're saying is that we're going to use multiple layers to progressively extract higher-level features from raw input. You know, so
00:05:19.560 --> 00:05:25.170
Myles Brown: the raw input might be just, you know, frames of pictures, and then we're sort of, basically, you know, looking at it
00:05:25.710 --> 00:05:30.030
Myles Brown: usually hierarchically and saying, okay, this is a person's face. Oh no, this is, you know,
00:05:30.780 --> 00:05:37.680
Myles Brown: Donald Trump's face or whoever and and so that's that's the kind of thing that we want to do. And so, to try and compare these, you know,
00:05:38.010 --> 00:05:45.780
Myles Brown: I saw a nice analogy, where we say AI is that idea of, we can build a machine that can play chess based on the rules. Now, you can teach it the rules.
00:05:46.050 --> 00:05:52.800
Myles Brown: And then you can teach it some strategy, but it's got to take a programmer to code that strategy and add it in as more of the rules.
00:05:53.520 --> 00:06:00.690
Myles Brown: Machine learning is where you build a machine that learns to play chess by analyzing past chess games, right?
00:06:01.140 --> 00:06:08.760
Myles Brown: Games that were played by humans, right? So you're saying, hey, look at this past data and learn from it, and the more data we give it, the better those algorithms are. And
00:06:09.150 --> 00:06:17.070
Myles Brown: Deep learning is that new thing that, you know, it requires a lot more computation, right, which wasn't really available until more recently.
00:06:17.640 --> 00:06:25.980
Myles Brown: But that's where you build a machine that learns to play chess by playing itself right and running through more scenarios than anybody has ever played.
00:06:26.490 --> 00:06:37.680
Myles Brown: And so that's the kind of thing where deep learning is sort of that newer, sexier type of machine learning that we see. When we talk about analytics in general,
00:06:38.940 --> 00:06:55.590
Myles Brown: there are sort of different types of analytics, starting with descriptive analytics. This is what we've been doing for 30 years. I think probably the '90s is where this really started to get popular, where we started to see business intelligence and data mining and things like that.
00:06:57.120 --> 00:06:57.660
Myles Brown: And
00:07:00.300 --> 00:07:00.990
Myles Brown: Oh, sorry.
00:07:02.580 --> 00:07:13.680
Myles Brown: Duplicate in this slide. I'm not sure why. So descriptive analytics is very widely used. It's the simplest type of analytics. It's where we condense large data sets into smaller, more useful nuggets.
00:07:14.160 --> 00:07:20.460
Myles Brown: And, you know, we've been doing this for a long time. I did my first data warehouse project in, I think, '96 or '97, and there,
00:07:20.850 --> 00:07:29.790
Myles Brown: you know, shortly after college, I was working at a place and we were just basically grabbing data from our mainframe and from some other sources and putting it into an Oracle database,
00:07:30.210 --> 00:07:42.720
Myles Brown: And just to put it in one place summarized easy to get. And we've been doing that for a long time. So that's descriptive analytics, the next level. It's more predictive analytics where we're doing some sort of forecasting right
00:07:43.800 --> 00:07:49.710
Myles Brown: And we're answering questions like, what will happen if some condition happens? Right, so that's predictive modeling.
00:07:50.700 --> 00:08:01.230
Myles Brown: Why did this happen? That's where we do root cause analysis: look at data and say, well, let's find correlated data. Or forecasting, like, what if,
00:08:01.950 --> 00:08:14.910
Myles Brown: what if the trend continues? Like Monte Carlo simulations, pattern identification, alerting. You know, there's all kinds of things that we can do where we're saying, let's use the history to predict the future.
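As a rough illustration of the "what if the trend continues" question, here is a minimal Monte Carlo sketch. It is not from the talk: the function name `simulate_trend`, the monthly visit numbers, and the choice to resample past month-over-month changes are all assumptions made for the example.

```python
import random
import statistics

def simulate_trend(history, periods, runs=2000, seed=0):
    """Monte Carlo forecast: replay randomly chosen past changes into the future."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    # Period-over-period changes observed in the history.
    changes = [later - earlier for earlier, later in zip(history, history[1:])]
    finals = []
    for _ in range(runs):
        value = history[-1]
        for _ in range(periods):
            value += rng.choice(changes)  # assume the future resembles the past
        finals.append(value)
    finals.sort()
    tail = runs // 20  # 5% of the simulated outcomes on each side
    return {
        "low": finals[tail],
        "median": statistics.median(finals),
        "high": finals[runs - tail - 1],
    }

# Hypothetical monthly site visits, in thousands.
monthly_visits = [100, 108, 115, 121, 130, 138]
forecast = simulate_trend(monthly_visits, periods=3)
print(forecast)
```

The result is a range rather than a single number, which is the point of simulating many possible futures instead of extending one straight line.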
00:08:16.320 --> 00:08:32.790
Myles Brown: And so that's where most companies are now, right? We've all been collecting the data for a long time. A lot of companies are now trying to do this, sort of, predictive analytics. The next level is prescriptive, where we're really trying to say, what should my business do? Okay.
00:08:33.900 --> 00:08:41.580
Myles Brown: This is a little more rare to find in a lot of organizations. Now of course there are companies that this is all they do is build software around this kind of thing.
00:08:42.540 --> 00:08:58.080
Myles Brown: But the idea here is that we're going to simulate the future under various sets of assumptions analyze all those scenarios and then suggest what should the company do based on this. And so that's that's kind of the different levels of analytics and I saw a really good
00:08:59.760 --> 00:09:09.660
Myles Brown: Sort of anecdote that kind of described it, and it was this idea that, let's say we have a lioness, and you know she needs to go and hunt for creatures in the jungle.
00:09:10.590 --> 00:09:18.810
Myles Brown: Actually, probably not in the jungle, but whatever. So she hires a data scientist because, I don't know, she's got money to spare, and she's, like, kind of lazy.
00:09:19.890 --> 00:09:27.510
Myles Brown: And so the data scientist, you know, if they study the historical data and provide the lioness with a report of, where did you find your prey in the last six months,
00:09:27.750 --> 00:09:32.340
Myles Brown: that will help her decide where to go hunting next. That's descriptive analytics, right? That's,
00:09:32.730 --> 00:09:41.280
Myles Brown: You know, building a data warehouse with all the past data and making it easy to digest so that the, you know, stakeholders can make good decisions and
00:09:41.880 --> 00:09:52.950
Myles Brown: The next level is where the data scientist says, well, let me estimate the probability of finding prey at a certain place and time using some advanced machine learning techniques.
00:09:53.340 --> 00:10:03.480
Myles Brown: That's more predictive analytics, telling her, hey, instead of you having to make the decision, I'm going to tell you where to find prey, you know, with the probability of finding it at certain places. And
00:10:04.470 --> 00:10:15.330
Myles Brown: The next level is where we do some sort of optimization. And that's where the data scientist says here's the routes that you should run through the jungle to take the minimum effort in finding your prey.
00:10:15.720 --> 00:10:30.210
Myles Brown: And and so that's where you actually say you should change your business of how you hunt to do it this way to take advantage and and so that's where organizations want to get to. There are very few that are really doing a lot of that.
00:10:30.660 --> 00:10:42.000
Myles Brown: You know, heavy duty optimization and then simulation of all kinds of, you know, options, but that's that's the ideal for a lot of companies for data driven companies they call themselves.
00:10:42.960 --> 00:10:50.790
Myles Brown: So I mentioned that, you know, there's a lot of forward-looking trends whenever you go and Google, hey, what are the,
00:10:52.200 --> 00:11:00.540
Myles Brown: you know, big data trends or analytics trends or, you know, anything like that for 2020, and it's always, like, way out. And so
00:11:01.350 --> 00:11:07.680
Myles Brown: late last year Gartner put out their Hype Cycle for AI, right, so artificial intelligence.
00:11:08.490 --> 00:11:10.320
Myles Brown: You know, there's a lot of things that are in there.
00:11:10.710 --> 00:11:14.640
Myles Brown: And some of them have been around for a while and are pretty mature, and others are brand new.
00:11:14.850 --> 00:11:22.170
Myles Brown: And so what we find is that there's this sort of hype cycle that most technology goes through where when it first comes out, it's, it becomes super popular
00:11:22.410 --> 00:11:27.090
Myles Brown: But then you get to this sort of peak of inflated expectations, where people say, oh, this is going to be the greatest thing.
00:11:27.480 --> 00:11:33.240
Myles Brown: And then once they learn more about it. Then they go into the trough of disillusionment, where they say, Oh, this isn't
00:11:33.840 --> 00:11:41.790
Myles Brown: The silver bullet that's going to fix everything right and then eventually we get to that slope of enlightenment into the plateau of productivity, where we say
00:11:42.120 --> 00:11:49.860
Myles Brown: Now I understand what is this technology good for what is it not good for and where, where will I be using this right
00:11:50.310 --> 00:11:55.860
Myles Brown: And so a lot of times what we see when we sort of Google some trends,
00:11:56.250 --> 00:12:01.500
Myles Brown: What we're seeing is a lot of like analytics and AI technologies that aren't even close to being widely used.
00:12:01.800 --> 00:12:16.350
Myles Brown: And I think they could be five to 10 years away from being really ready to go. Yeah. And so if we look at the Gartner cycle, I know this is very small, but, you know, the circles that are sort of white circles, these are things that are, you know, here now.
00:12:17.370 --> 00:12:24.270
Myles Brown: So things like speech, right, speech recognition. Right, you've got Alexa, Google Home, or whatever. You know, those things are working pretty well.
00:12:24.480 --> 00:12:36.480
Myles Brown: But it's not like everybody built their apps on that now there's more and more people taking advantage of those kinds of things. And it's likely that I don't have to build the algorithms myself right
00:12:37.980 --> 00:12:45.480
Myles Brown: GPU accelerators are all over the place, right? So there's some things that are already there at the plateau of productivity. We know what they're good for. We're using them.
00:12:45.990 --> 00:12:53.520
Myles Brown: But then there's some things, like these orange ones, that are more than 10 years away, like autonomous vehicles. Remember how popular that was? Oh, this is gonna be the greatest thing.
00:12:53.760 --> 00:13:02.370
Myles Brown: And then they started crashing a little bit and people started to say, wait a minute. There's some moral decisions about these, you know, and now we're sort of in that trough of disillusionment, where we say
00:13:02.730 --> 00:13:11.820
Myles Brown: you know, we'll figure it out, but it's not here yet. It's more than 10 years from now before, you know, the road is full of nothing but autonomous cars. And
00:13:12.990 --> 00:13:19.020
Myles Brown: And so, you know, all these different technologies are somewhere along this curve. And a lot of them are still fairly early on.
00:13:19.320 --> 00:13:29.610
Myles Brown: And still, you know, at least two to five years away, and sometimes five to 10 or even more, right? And so that's all I wanted to say about that. I didn't want you to have to go and look through all these things.
00:13:30.840 --> 00:13:32.370
Myles Brown: But that comes straight from the Gartner
00:13:33.390 --> 00:13:37.380
Myles Brown: report from last, I think it was last September.
00:13:39.330 --> 00:13:48.900
Myles Brown: So let's concentrate more on trends that we actually see right so we have customers. We do a lot of AWS training Google Cloud Azure.
00:13:49.230 --> 00:14:00.270
Myles Brown: As well as Cloudera and Databricks. So we have a lot of analytics customers coming through. Not to mention, you know, IBM and Oracle, and they have, you know, all kinds of analytics offerings as well.
00:14:01.050 --> 00:14:12.240
Myles Brown: SAP and SAS. You know, we've got a lot of people coming through our doors for our classes. Now it's more virtual classes, but generally we have sort of a mixture of virtual and in-class training.
00:14:13.680 --> 00:14:24.330
Myles Brown: And one of the big trends we find is that everybody is already over that hump of big data. Everybody's got big data. Everyone's doing some sort of analytics now, right? Not everybody has embraced AI that much.
00:14:24.810 --> 00:14:31.080
Myles Brown: So if we kind of look, you know, the percentage of adoption here. Over time, you know, that sort of
00:14:32.190 --> 00:14:41.070
Myles Brown: getting to that point of, hey, now we've all got big data. Now we're moving more into analytics, maybe even into AI. We're somewhere in here, really at the beginning of what AI can do.
00:14:42.240 --> 00:14:55.290
Myles Brown: But but big data, you know anybody who's got, like I said, a popular website already has that. And what I find is people coming into our classes. They've been collecting the data for a while they're, they're probably kind of putting it into a data lake.
00:14:56.430 --> 00:15:10.170
Myles Brown: Either their solution is already up and running, or they're moving it to the cloud, because the cloud fits nicely with the you know the pricing model and the high availability and all that good stuff. And so
00:15:11.190 --> 00:15:16.530
Myles Brown: We get a lot of people in our classes kind of moving into the cloud with with their data lake.
00:15:17.580 --> 00:15:21.900
Myles Brown: But that's been underway for a while. Some other trends that we're seeing.
00:15:23.160 --> 00:15:30.180
Myles Brown: Students in our classes, if we look, say, five to 10 years ago, you know, people were taking a class to learn Spark,
00:15:30.990 --> 00:15:36.630
Myles Brown: which is part of, you know, the Hadoop ecosystem, how we build, you know, big data jobs.
00:15:37.230 --> 00:15:46.590
Myles Brown: They were mostly, like, say 10 years ago, just looking at it, saying, should we be doing this? Is this, you know, something of interest? Whereas five years ago, it was really a lot of developers coming through.
00:15:46.920 --> 00:16:00.240
Myles Brown: They were either Java or Scala or Python programmers, and they were building batch jobs to basically take all this data in raw format, clean it up, and get it into a nice format that we can then do analytics on.
00:16:01.350 --> 00:16:06.870
Myles Brown: What we really started to see is more of a shift. Yes, people are doing that, but that's mature and they've been doing it for a long time.
00:16:07.200 --> 00:16:14.220
Myles Brown: Then they started looking a lot more at streaming data right little pieces of data flying in from, you know, IoT devices or
00:16:14.730 --> 00:16:22.230
Myles Brown: log files, you know, instead of waiting for a whole log file as the log entries get generated. There's a lot of little pieces of data flying in
00:16:22.800 --> 00:16:27.810
Myles Brown: And so we started to see people take Spark Streaming, you know, pretty seriously. That's a way of
00:16:28.530 --> 00:16:37.470
Myles Brown: Using the same language and the same knowledge that developers would have for building batch jobs to build these little, you know, sort of mini batches. So streaming
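The mini-batch idea can be sketched in plain Python. This is only an illustration and not Spark code: Spark Streaming cuts batches by time interval, while this sketch cuts by event count for simplicity, and the names `micro_batches`, `count_by_device`, and the sensor events are made up for the example.

```python
from itertools import islice

def micro_batches(events, batch_size):
    """Group a (possibly endless) stream of events into small lists."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # stream is exhausted
        yield batch

def count_by_device(batch):
    """The same kind of logic a batch job would run, applied to one mini batch."""
    counts = {}
    for event in batch:
        counts[event["device"]] = counts.get(event["device"], 0) + 1
    return counts

# Hypothetical IoT readings arriving one at a time.
stream = ({"device": f"sensor-{i % 2}", "reading": i} for i in range(5))
results = [count_by_device(batch) for batch in micro_batches(stream, batch_size=2)]
print(results)
```

The per-batch function is ordinary batch logic, which is why developers who already write batch jobs can reuse what they know.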
00:16:38.460 --> 00:16:49.050
Myles Brown: Now we need somewhere to hold that stream of data, you know, could be millions of events coming per, per second. Right. And so we see the rise of something like Apache Kafka, or
00:16:49.380 --> 00:16:56.040
Myles Brown: in Amazon, they have Amazon Kinesis, which is a very similar kind of concept. And so those are starting to get pretty popular.
00:16:56.460 --> 00:17:05.520
Myles Brown: Another shift that we started to see, you know, especially if I look in our Databricks classes: it was a lot of, you know, developers and stuff, and now it's a lot more data scientists.
00:17:05.820 --> 00:17:13.230
Myles Brown: And those data scientists are doing sort of experimental stuff. So they're not just building programs that run as, you know, a nightly ETL job.
00:17:13.560 --> 00:17:17.580
Myles Brown: They're doing more exploratory analysis and so they use something like a notebook.
00:17:18.120 --> 00:17:31.200
Myles Brown: Like Jupyter notebooks, and they're connecting to live Hadoop clusters and they're really wrangling data. They're cleaning the data and doing stuff with it. And so that idea of data science has really grown over the last, say, five years.
00:17:32.220 --> 00:17:46.620
Myles Brown: And we see that the percentage of people who are data scientists is growing all the time. It's not growing fast enough, we find, and so we'll come back to that trend. But we are seeing more and more people who have those skills. And
00:17:47.670 --> 00:17:52.020
Myles Brown: probably the biggest shift we've seen over the last, say, five years is the movement to the cloud.
00:17:52.350 --> 00:18:01.470
Myles Brown: And not just analytics. Like, everything in IT is moving to the cloud. But certainly, big data fits nicely with the cloud. If you think about it, we've got really cheap storage there,
00:18:01.950 --> 00:18:10.560
Myles Brown: which is, you know, nearly infinite in scalability, right? You just add more data, more data, and you pay for however much data you store in there.
00:18:11.070 --> 00:18:19.230
Myles Brown: And it's a very highly durable, they generally will copy it around the different data centers so that there's not a single point of failure. And it's highly available.
00:18:20.760 --> 00:18:25.560
Myles Brown: And so what we see is that traditional sort of analytics companies like
00:18:26.250 --> 00:18:33.240
Myles Brown: Hadoop vendors like Cloudera and Hortonworks, you know, they had to really embrace those cloud strategies in order to compete
00:18:33.570 --> 00:18:44.310
Myles Brown: with the big three public cloud providers. That's AWS, Microsoft Azure, and Google Cloud. Because they each have their own managed Hadoop implementations, they started to take a lot of market share.
00:18:44.790 --> 00:18:50.070
Myles Brown: We saw that Cloudera and Hortonworks ended up having to merge. You know, so now they have sort of,
00:18:51.900 --> 00:18:58.440
Myles Brown: You know, they're very popular. If you're in that sort of hybrid where you have some things in your own data centers, you want to run some workloads in the cloud.
00:18:59.130 --> 00:19:09.240
Myles Brown: They've really embraced that. But this movement of, you know, large amounts of data into the cloud has been a big trend that we see, and continues to be. And
00:19:10.050 --> 00:19:14.190
Myles Brown: Another thing that we find is that hiring a data scientist is really hard.
00:19:14.700 --> 00:19:22.980
Myles Brown: I remember we mentioned that, you know, they have to have some domain expertise, they have to know about the problem at hand. They have to have some deep statistics knowledge.
00:19:23.760 --> 00:19:31.650
Myles Brown: And they have to be able to sort of hack and program. Right. And so there's a lot that goes into being a data scientist. It's very multidisciplinary
00:19:32.190 --> 00:19:36.570
Myles Brown: And so hiring them is hard to do. You know, you can Google any kind of article and say,
00:19:36.870 --> 00:19:42.660
Myles Brown: You know, hey, is there a shortage of data scientists and you'll see a million articles detailing exactly that we have so many
00:19:42.870 --> 00:19:51.360
Myles Brown: Fewer qualified applicants than there are positions available right and so what most companies are finding is that it's easier to train somebody in house.
00:19:51.960 --> 00:20:03.750
Myles Brown: You know that already, you know, hey, I took a math degree and I've got lots of stats knowledge and I'm a programmer. So I just need to learn more of the machine learning and more about our domain or maybe
00:20:04.440 --> 00:20:13.620
Myles Brown: they've been a, you know, statistician or something like that, working in a company, and then we have to teach them sort of the programming part.
00:20:14.040 --> 00:20:20.880
Myles Brown: And so that's what we find in these classes is that we're getting people who have two of the three and they just need to add the third
00:20:21.330 --> 00:20:28.950
Myles Brown: And so it might be easier to build a data scientist than to hire one right now. I think that's changing, right?
00:20:29.790 --> 00:20:38.700
Myles Brown: Just like when any new technology comes out, it becomes, you know, very specialized and there are very few people that can do it. And then over time you go to colleges and universities
00:20:38.880 --> 00:20:46.560
Myles Brown: And they're turning out people who have some of these skills and they really just have to learn more domain expertise from that company right
00:20:48.330 --> 00:20:58.770
Myles Brown: Maybe in the future, machine learning expertise will just be another part, you'll say I'm a software engineer, that means that you have all this at your disposal, but I don't think we're anywhere close to that yet.
00:20:59.550 --> 00:21:10.230
Myles Brown: Actually, because of this shortage of data scientists, what we're starting to see is that people are saying, well, instead of trying to build the data scientist,
00:21:10.950 --> 00:21:23.880
Myles Brown: why don't we just go and buy a product that's already got the machine learning algorithms all built and trained and everything, right? And so this idea of productizing machine learning is a trend that we see as well.
00:21:25.740 --> 00:21:37.560
Myles Brown: And so, you know, use a product that has machine learning built in to make data science available to the masses. You know, sometimes people call this AutoML, and so it's like sort of automatic machine learning.
00:21:38.370 --> 00:21:50.640
Myles Brown: There's a lot of options. If you look at the public cloud providers, they've got natural language processing, image recognition. Right, so instead of me having to go and hire a data scientist that knows all about the stats
00:21:51.540 --> 00:22:00.150
Myles Brown: and the building of algorithms, to try and build a model and train it and run it and test it and validate it,
00:22:01.260 --> 00:22:13.290
Myles Brown: I can just go to a cloud vendor and say, hey, here in your Cloud Storage. I've dropped thousands of photos. I want you to go index those and tell me the ones that are
00:22:14.250 --> 00:22:22.200
Myles Brown: A photo of a person wearing sunglasses. Right. And it should be able to go and figure that out, right, because they built these kinds of models already
00:22:22.590 --> 00:22:33.840
Myles Brown: They've been using them internally at Amazon and Google and everywhere else. They basically just took those models that they built and trained and said, oh, here's an API you can call, you know, for a price.
00:22:34.680 --> 00:22:44.310
Myles Brown: And so, those things are available to us. Now you can build recommendation engines image recognition and all that kind of stuff, but
00:22:45.540 --> 00:22:57.420
Myles Brown: Not everything in data science is going to be that easy right there are certain categories that are popular enough that they built an API and made it available to you. So the concept of just saying, Oh, we're not gonna have data scientists anymore.
00:22:57.750 --> 00:23:01.650
Myles Brown: I don't think that we're ever going to get there. Right. But for the most common stuff.
00:23:02.130 --> 00:23:17.550
Myles Brown: I think we're going to see a lot more things become APIs, and so just a regular developer, you know, you can put these APIs in their hands, teach them how they work, and then they don't have to know all of the heavy-duty math underneath what's going on, right?
00:23:19.110 --> 00:23:20.970
Myles Brown: So that's another big trend.
00:23:22.260 --> 00:23:32.040
Myles Brown: Some of the other trends that we're seeing is as data science moves from being exploratory type stuff.
00:23:33.030 --> 00:23:44.220
Myles Brown: into sort of a little more mature what people are trying to do is automate as much as they can. Right, so much of data science right now is exploratory and currently requires a human brain to do it.
00:23:44.940 --> 00:23:59.370
Myles Brown: There are definitely parts of data science that are very mundane repetitive tasks and those things should be able to, we should be able to automate them. And so some of the tasks that we typically automate are the basic ingestion and replication of data on a schedule.
00:24:00.420 --> 00:24:05.850
Myles Brown: And then maybe validating that data, you know, detecting things like typos or or
00:24:06.870 --> 00:24:17.520
Myles Brown: values that are missing, you know, basically nulls. Identify content that doesn't match this data model that we have, right? So what comes to mind for me is,
00:24:18.390 --> 00:24:29.010
Myles Brown: You know, you know, I use AWS, a lot. They've got a bunch of big data services for things like, Hey, let's build a pipeline that starts with sort of a
00:24:29.820 --> 00:24:36.480
Myles Brown: precondition that says look in this S3 bucket. And if there are files in there, then let's kick this off.
00:24:36.870 --> 00:24:48.270
Myles Brown: And what does it do? It says, okay, go grab those files, run through this, you know, process of running a series of steps and cleaning it and cataloging it and all this kind of thing.
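The precondition-then-steps pattern described here can be sketched without any cloud service. This is only an illustration under stated assumptions: a plain dictionary stands in for the S3 bucket, and `has_new_files`, `clean`, `catalog`, and `run_pipeline` are hypothetical names, not AWS APIs.

```python
def has_new_files(bucket):
    """Precondition: only kick things off when the (stand-in) bucket holds files."""
    return len(bucket) > 0

def clean(rows):
    """Drop rows with a missing or blank value and trim stray whitespace."""
    return [
        {key: value.strip() for key, value in row.items()}
        for row in rows
        if all(value is not None and value.strip() for value in row.values())
    ]

def catalog(name, rows):
    """Record what was loaded: file name, row count, and column names."""
    columns = sorted(rows[0]) if rows else []
    return {"file": name, "rows": len(rows), "columns": columns}

def run_pipeline(bucket):
    """Run the cleaning and cataloging steps for every file, in name order."""
    if not has_new_files(bucket):
        return []  # precondition not met, nothing to do
    return [catalog(name, clean(rows)) for name, rows in sorted(bucket.items())]

# Hypothetical bucket contents: one file with one good row and one bad row.
bucket = {
    "clicks.csv": [
        {"user": " ann ", "page": "/home"},
        {"user": "bob", "page": None},
    ]
}
print(run_pipeline(bucket))
```

In a managed service, the same shape shows up as a trigger plus a sequence of configured activities rather than hand-written functions.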
00:24:50.130 --> 00:24:54.480
Myles Brown: Along the way you might use something like AWS Glue, which is sort of
00:24:55.890 --> 00:24:56.670
Myles Brown: Managed
00:24:59.010 --> 00:25:06.720
Myles Brown: data catalog and ETL service. And it's actually got the smarts to say, hey, when I went and looked in that database,
00:25:07.950 --> 00:25:12.780
Myles Brown: this table has a new column. What should I do with this extra column?
00:25:13.530 --> 00:25:18.420
Myles Brown: Well, let me go back to my main sort of canonical data structure and add this new column.
00:25:18.810 --> 00:25:32.640
Myles Brown: Or should I just ignore it? You know, I can put the smarts in so that when it goes and crawls and it finds all my data sources, it says what to do when the data has changed since the last time I looked at it. And so we can actually build that and automate all that.
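The "add the new column or ignore it" decision can be sketched as a small policy function. This is an illustration only and not the AWS Glue API: `reconcile_schema`, the policy names, and the example columns are assumptions made for the example.

```python
def reconcile_schema(canonical, observed, policy="add"):
    """Return the updated canonical column list and the columns left out."""
    new_columns = [column for column in observed if column not in canonical]
    if policy == "add":
        # Extend the canonical structure with whatever the crawl found.
        return canonical + new_columns, []
    if policy == "ignore":
        # Keep the canonical structure and report what was skipped.
        return list(canonical), new_columns
    raise ValueError(f"unknown policy: {policy}")

# Hypothetical table: the source gained a coupon_code column since the last crawl.
canonical = ["order_id", "amount"]
observed = ["order_id", "amount", "coupon_code"]
print(reconcile_schema(canonical, observed, policy="add"))
print(reconcile_schema(canonical, observed, policy="ignore"))
```

Deciding the policy up front is what lets the crawl run unattended instead of stopping for a person each time a source changes.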
00:25:34.110 --> 00:25:43.140
Myles Brown: And once we get these models built, you know, we can actually automate the labeling of the data, the training and validating of the models,
00:25:43.770 --> 00:25:52.800
Myles Brown: iterating study runs, you know, the actual machine learning, we can automate parts of that. And then finally, you know, the creation of dashboards.
00:25:53.610 --> 00:26:00.330
Myles Brown: You know, if the whole thing halts and then somebody has to press a button to say go build these dashboards, you know, that's kind of stupid.
00:26:00.600 --> 00:26:05.220
Myles Brown: Right. So that's something that we can easily automate and then reporting in general, we should be able to automate
00:26:05.970 --> 00:26:19.650
Myles Brown: And so this idea of building kind of an automated data pipeline, this borrows a lot of ideas from DevOps, right, that idea of a CI/CD pipeline. You know, developers say, hey, I check code into a
00:26:20.760 --> 00:26:26.130
Myles Brown: code repository, and that kicks off a bunch of steps that automatically happen that go and build
00:26:26.970 --> 00:26:36.300
Myles Brown: whatever the artifact is. I put it into a production-like environment for testing. I run a series of tests, and if everything is good, great.
00:26:36.660 --> 00:26:41.190
Myles Brown: We've got a product that can be you know put into production at any time now.
00:26:41.910 --> 00:26:52.470
Myles Brown: And so that idea of automating all that is very, very attractive, and that's what we're starting to see on the analytics side a lot more, right, is building that data pipeline.
00:26:52.950 --> 00:27:04.800
Myles Brown: Right, in AWS we actually have a tool called Data Pipeline. That's one of the services where you can kind of stitch together all these different pieces and build that out and say, let's kick this off every night at 10pm.
00:27:05.160 --> 00:27:14.970
Myles Brown: Go and do a series of batch jobs and clean up my data and everything else. And so that's where a lot of companies are now, right, or they're trying to get there. Right.
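The "kick this off every night at 10pm" schedule boils down to computing the next run time. A minimal sketch in plain Python, assuming naive local datetimes; `next_run` is a hypothetical helper for illustration, not part of AWS Data Pipeline.

```python
from datetime import datetime, time, timedelta


def next_run(now: datetime, at: time = time(22, 0)) -> datetime:
    """Return the next occurrence of the nightly run time strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's 10pm slot has already passed (or is right now): go to tomorrow.
        candidate += timedelta(days=1)
    return candidate


morning = datetime(2020, 5, 1, 9, 0)
late_night = datetime(2020, 5, 1, 23, 30)
first = next_run(morning)      # same day at 22:00
second = next_run(late_night)  # next day at 22:00
```

A real scheduler would also need to think about time zones and missed runs; this sketch ignores both.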
00:27:16.860 --> 00:27:27.090
Myles Brown: Now that automation and kind of those ideas from DevOps, this has sort of spawned a new term that you may or may not have heard of called DataOps.
00:27:28.050 --> 00:27:37.710
Myles Brown: It started to pick up some prominence about three years ago. People started talking more about it. And then there was a DevOps, sorry, DataOps manifesto, and a bunch of people signed it.
00:27:38.130 --> 00:27:52.530
Myles Brown: And we started to see that term, but really in the last year. It's become a very, you know, sexy term and it borrows from some of the concepts of DevOps, some of the ideas of the iterative and incremental, you know, kind of
00:27:53.790 --> 00:28:07.050
Myles Brown: Programming of agile concepts and that idea that we get from manufacturing, you know, a quality control idea of statistical process control where we're basically saying, hey,
00:28:07.500 --> 00:28:15.480
Myles Brown: I've got this new product, but if it's not as good as the last product, then I'm not going to use it. Right. And so it's really data driven.
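The "only use it if it's at least as good as the last one" gate, plus a basic statistical process control check, can be sketched with Python's standard library. This is a simplified illustration under stated assumptions: quality is a single number where higher is better, control limits are the mean plus or minus three standard deviations of past runs, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def control_limits(history, sigmas=3.0):
    """Return (lower, upper) control limits from past measurements."""
    center = mean(history)
    spread = pstdev(history)
    return center - sigmas * spread, center + sigmas * spread


def in_control(history, value, sigmas=3.0):
    """True if a new measurement falls inside the control limits."""
    low, high = control_limits(history, sigmas)
    return low <= value <= high


def should_promote(candidate_score, current_score):
    """Promote the new version only if it is at least as good as the current one."""
    return candidate_score >= current_score


row_counts = [100, 102, 98, 101, 99]  # made-up nightly row counts
ok_tonight = in_control(row_counts, 103)
alarm_tonight = not in_control(row_counts, 110)
```

The same two checks apply to both pipelines: `in_control` guards the data flowing through production, and `should_promote` guards new work on its way in.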
00:28:15.990 --> 00:28:25.800
Myles Brown: And so in DataOps what we often see is that we've got kind of two pipelines going on. One is what they call the value pipeline.
00:28:26.400 --> 00:28:43.020
Myles Brown: And so this is an automated process where we're basically taking data and getting value out of it. Right. So hey, everyone's got data, what I really need is insight, you know, and so I'm going to run it through a series of steps that eventually get me. Oh, that's what that data means. Yeah.
00:28:44.100 --> 00:28:53.640
Myles Brown: But it's not just a data pipeline, because it also brings in process to improve quality and reduce cycle times of data analytics.
00:28:53.940 --> 00:28:57.750
Myles Brown: Because you have some data scientist that goes and explores some data and says, hey,
00:28:58.170 --> 00:29:09.660
Myles Brown: You know, based on my findings, we should be able to do this kind of analysis, you know, hey, we got all these credit card transactions and I figured out a way to figure out, hey, are these ones fraudulent or not.
00:29:10.140 --> 00:29:19.020
Myles Brown: Right. Well, now it's time to automate that and build a product out of it so that you know every time new transactions come in, we can say is this fraud or not.
00:29:19.380 --> 00:29:23.910
Myles Brown: We flag it. Well, it looks like 80% probability it's fraud. Okay, flag it.
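The flag-it-at-80% rule can be sketched as a threshold over a scoring function. This is a toy illustration: `toy_score` is a hypothetical stand-in for a trained model, and the field names and weights are made up for the example.

```python
FRAUD_THRESHOLD = 0.80  # the 80% probability mentioned in the talk


def toy_score(transaction):
    """Hypothetical stand-in for a trained model: returns a probability in [0, 1]."""
    probability = 0.1
    if transaction["amount"] > 1000:
        probability += 0.5
    if transaction["country"] != transaction["home_country"]:
        probability += 0.3
    return probability


def flag_transactions(transactions, score=toy_score, threshold=FRAUD_THRESHOLD):
    """Return the transactions whose fraud probability meets the threshold."""
    return [t for t in transactions if score(t) >= threshold]


incoming = [
    {"id": 1, "amount": 40, "country": "CA", "home_country": "CA"},
    {"id": 2, "amount": 5000, "country": "RU", "home_country": "CA"},
    {"id": 3, "amount": 2500, "country": "CA", "home_country": "CA"},
]
flagged_ids = [t["id"] for t in flag_transactions(incoming)]
```

Passing the scoring function in as an argument means a better model can later be swapped in without touching the flagging logic.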
00:29:24.300 --> 00:29:32.820
Myles Brown: Right. And so that's the kind of thing that, you know, you build as an experimental thing. We test it out. We run through all kinds of validation.
00:29:33.150 --> 00:29:43.500
Myles Brown: And once we have it, then we put it into production. But then we don't just say, well, that's it forever, right. We always want to continuously improve, right, that's part of the DevOps ideas.
00:29:44.430 --> 00:29:52.410
Myles Brown: And so really, there's orchestration. There's two pipelines. There's that value pipeline, where data moves through a process to provide value.
00:29:52.680 --> 00:30:02.790
Myles Brown: But there's also the innovation pipeline and that's this one going south to north here on the diagram where we start with an idea and then we develop it and then eventually get it in production.
00:30:03.330 --> 00:30:20.190
Myles Brown: But only if, you know, we look at the statistics and it looks like it's going to be at least as good as what we were doing before. Right. And so this is really where we're starting to see a lot of, you know, I had a discussion with a customer on
00:30:21.540 --> 00:30:29.100
Myles Brown: Thursday last week, where he basically dropped the DataOps buzzword. He didn't want to.
00:30:29.520 --> 00:30:37.500
Myles Brown: He said, I hate to use a popular buzzword right now, but he's like, we're kind of in the DataOps. And I was like, I know exactly what you mean.
00:30:37.800 --> 00:30:46.740
Myles Brown: Right. And so this is what we're starting to have those conversations with companies that are really data driven and most enterprises are not quite here.
00:30:47.130 --> 00:30:49.890
Myles Brown: They've built their data warehouse.
00:30:50.190 --> 00:30:56.910
Myles Brown: You know, they're grabbing the data, pulling it out of various places. They might have a data lake for more ad hoc kind of stuff.
00:30:57.120 --> 00:31:08.640
Myles Brown: And they have data scientists doing ad hoc stuff, but they might not be productizing it that much. Or they might not have a full, like, DataOps kind of approach yet.
00:31:09.570 --> 00:31:20.190
Myles Brown: And so that's sort of where some of the more data-driven customers we have are now. Having said that, we've got customers at every end of this spectrum, right.
00:31:20.490 --> 00:31:25.800
Myles Brown: There are companies that are just AI companies and they've been doing AI for 10 years right but
00:31:26.280 --> 00:31:39.810
Myles Brown: But I'm looking at you know the bulk of our enterprise customers are not at this point yet, but it's good to know where those companies are going and where you might want to be in that so
00:31:40.590 --> 00:31:50.010
Myles Brown: If you want to learn more, there's a lot of places to go. Right. So if you're sort of more on the Hadoop side, we have both Databricks and Cloudera classes.
00:31:50.400 --> 00:32:00.210
Myles Brown: You know, they start with usually kind of an intro to Hadoop and, you know, learning Spark, but then they go on from there to talk about some of the other Hadoop ecosystem projects.
00:32:01.350 --> 00:32:12.600
Myles Brown: All of our cloud vendors. So here we are partnered with the big three. Typically, if you want to take a course like the big data on AWS class, you better take some, you know,
00:32:13.320 --> 00:32:22.410
Myles Brown: previous AWS classes to learn about AWS, and then you learn about their analytics. Google Cloud actually does a pretty good job, where
00:32:22.890 --> 00:32:36.960
Myles Brown: They've got a four-day data, excuse me, data engineering class. We typically take the one-day, you know, Google Cloud fundamentals first just to get the basics of Google Cloud and then jump into that four-day. And that's a great class. And what we're finding is that
00:32:38.070 --> 00:32:48.840
Myles Brown: Google Cloud is becoming very popular because they do a great job of analytics. You know, if you look at the big three, like AWS has quite a bit of market share, Azure has been steadily growing,
00:32:49.440 --> 00:33:05.430
Myles Brown: Google Cloud is very much a third place, but if you look just at analytics and machine learning. They've got a really, really good offering and they've got a lot of market share of that part right
00:33:06.990 --> 00:33:17.910
Myles Brown: And then, you know, we have, like I said, over 25 other vendors. So, you know, if you're looking for business intelligence training, you know, IBM Cognos or Tableau or any of those kind of things.
00:33:18.960 --> 00:33:32.850
Myles Brown: When it comes to analytics, you've got SAP, oh yeah, SAS. You know, there's all kinds of Oracle. There's all sorts of analytics training in there. So we threw all these out. I think that Michelle was going to put these into the chat.
00:33:34.740 --> 00:33:35.880
Myles Brown: Have you done there. And my phone.
00:33:35.940 --> 00:33:36.870
Michelle :: Webinar Producer: Yeah there in the chat.
00:33:37.470 --> 00:33:44.790
Myles Brown: They're there. So, you know, whatever it is that you're looking for as far as classes go, we should have you covered.
00:33:45.240 --> 00:33:54.210
Myles Brown: And if you're looking for something more specific that you don't see here, you know, we have some classes that we run that we actually don't even have on the website yet.
00:33:55.110 --> 00:34:10.560
Myles Brown: You know, there's some, like, hey, I want to learn Python for Data scientists, you know, we have a class like that we run it. We don't have public classes for it, but we run it in private situation. So we do both public training and private group training.
00:34:11.670 --> 00:34:18.510
Myles Brown: And, you know, under normal circumstances, we can either do it, you know, send somebody to your office or, you know, you can come into ours.
00:34:19.110 --> 00:34:26.820
Myles Brown: Or we can do it virtual, and right now we're doing all our classes virtual. But luckily, we've been doing virtual training for eight years, so we're pretty good at it.
00:34:27.270 --> 00:34:36.540
Myles Brown: Zoom is the basis of our iMVP platform, but it's a lot more than just that. And we have a promo right now. Maybe, Michelle, you can tell us about the promo.
00:34:36.960 --> 00:34:47.040
Michelle :: Webinar Producer: Absolutely really excited to talk about our refer a friend promo. So if you take a class with us. And then after your training refer a friend.
00:34:47.280 --> 00:34:58.500
Michelle :: Webinar Producer: or colleague, you'll get a $200 Amazon gift card and your referred friend can save 15% on their course. I'm going to post the link to all of those details in the chat as well.
00:34:59.910 --> 00:35:03.540
Myles Brown: So if they sign up, they get a discount and you get a gift card.
00:35:03.750 --> 00:35:06.990
Myles Brown: That's right. Cool. That sounds. Alright.
00:35:08.310 --> 00:35:22.650
Myles Brown: So that's all I really prepared. You know, I figured, well, let's leave like 20 minutes for questions. And so that's where we're at. So whatever questions you have, drop them into the Q&A and we can cover them off.
00:35:24.330 --> 00:35:33.240
Michelle :: Webinar Producer: While those questions come in. I just want to remind everyone that we've recorded this session, and we'll send a copy out to everyone. By the end of the week.
00:35:33.540 --> 00:35:48.900
Michelle :: Webinar Producer: I'll post those training links in the chat again in a few moments after a few questions come in, and as always you can reach out to us to learn more. So go ahead, find that Q&A icon at the bottom of your screen and ask any questions you might have.
00:36:33.060 --> 00:36:36.510
Michelle :: Webinar Producer: Myles, it looks like you may have just given a thorough presentation and
00:36:36.540 --> 00:36:38.760
Michelle :: Webinar Producer: covered all the questions before they're even asked.
00:36:39.240 --> 00:36:53.550
Michelle :: Webinar Producer: I posted all the links to those relevant classes in the chat so you can jot them down copy them, check out our website to learn more. I also posted our promo link again one more time.
00:36:54.750 --> 00:36:58.320
Michelle :: Webinar Producer: If there are no questions, I think we can wrap up here. Thank you so
00:36:58.320 --> 00:36:59.130
Michelle :: Webinar Producer: much, Myles.
00:36:59.730 --> 00:37:03.360
Myles Brown: Rob mentioned the Azure classes.
00:37:05.400 --> 00:37:23.130
Myles Brown: So yeah, any of these, if I just click on it, this will take us to our website. And so I just sent it to the main level of Azure classes, but if you go down the side there is a grouping here called Data and AI. And so there's sort of five main classes that Azure has.
00:37:24.510 --> 00:37:40.710
Myles Brown: The AI-100 is very much AI stuff. The DP ones are the data processing ones. And so there's the 200, which is Implementing an Azure Data Solution, and then the 201, Designing. So very often in the Azure classes, that's what they do, they kind of
00:37:42.360 --> 00:37:46.980
Myles Brown: They try and make a distinction between implementing and then designing so
00:37:47.580 --> 00:37:58.860
Myles Brown: It might be backwards to some people, but the implementing is learn the basics of how to, like, launch stuff, and then the designing is, hey, how do I really put these pieces together. So that's more of the advanced part.
00:38:00.300 --> 00:38:04.020
Myles Brown: And then they have like a combo designing and implementing
00:38:05.100 --> 00:38:11.700
Myles Brown: Like, basically the two classes together. So that's what the Azure curriculum looks like.
00:38:13.350 --> 00:38:26.460
Myles Brown: Oh, Courtney is asking a couple questions. First one, second one first Google or AWS for data analytics. I already hold AWS cloud architect. Yeah, so a lot of times what people are really interested in is having
00:38:27.480 --> 00:38:34.680
Myles Brown: multiple certifications from different vendors. Right. And I think that's a very good idea these days. You know, I know that,
00:38:36.390 --> 00:38:40.320
Myles Brown: You know, if I'm hiring somebody I want somebody with experience first right
00:38:40.620 --> 00:38:50.490
Myles Brown: But if I'm looking at two people that have experience, the one that has the certification, that tells me, well, they're serious about their career. They were able to pass these exams, and some of these exams are not easy.
00:38:50.820 --> 00:38:58.080
Myles Brown: Right so certification has a real value, especially when you're looking for jobs, but where it's really interesting, is where you have
00:38:58.650 --> 00:39:09.240
Myles Brown: multiple certifications from multiple vendors. Right. And so the AWS, you know, cloud solution architect is a pretty popular one. But it means, you know, you know the basics of AWS.
00:39:09.840 --> 00:39:17.280
Myles Brown: You can go further and get the AWS machine learning. That's a specialty certification. You know, we have a nice class for that.
00:39:17.910 --> 00:39:25.950
Myles Brown: Assuming you know a little bit about AWS, you can take this four day class called the machine learning pipeline on AWS that'll get you pretty much ready for that specialty exam.
00:39:26.730 --> 00:39:31.380
Myles Brown: But if you want to go to a second cloud vendor. What I would suggest
00:39:32.220 --> 00:39:38.730
Myles Brown: If it's on the data and analytic side, I would say Google, you know, because that's the second cloud that people are using
00:39:39.030 --> 00:39:46.830
Myles Brown: Like, we got a lot of customers who are AWS customers, have been for, you know, eight years or something. Most of their infrastructure's in AWS.
00:39:47.070 --> 00:39:55.170
Myles Brown: And now they say we got a lot of data sitting in S3 buckets. We're going to move it over to Google Cloud because we prefer their stack of analytics.
00:39:55.590 --> 00:40:07.770
Myles Brown: And so that would be where I would concentrate. Now, if you look at the Google Cloud side of things, you know, here we have, we have these learning paths and I'm gonna find the Google Cloud one
00:40:10.680 --> 00:40:11.490
Myles Brown: Yeah, it is.
00:40:15.060 --> 00:40:21.420
Myles Brown: So we got these PDFs to kind of show, you know, what the paths look like. So here's the cloud data engineer.
00:40:21.540 --> 00:40:29.730
Myles Brown: You know, we typically say take one of the fundamentals classes, either the core infrastructure or, more likely, this one, the big data and machine learning one-day class.
00:40:30.090 --> 00:40:39.090
Myles Brown: Then the four-day data engineering, that will pretty much prepare you for the professional data engineer exam. Right. And it's pretty good.
00:40:39.600 --> 00:40:51.630
Myles Brown: It's not an easy exam, by any means. And it helps if you've got a lot of big data experience, so that, you know, when they talk a little bit about, you know, the various services,
00:40:52.680 --> 00:41:03.840
Myles Brown: You know, some of them are things that you've never seen before, like BigQuery is kind of a different animal than what everybody else has. But some of them are basically just Hadoop, but it's a managed service that they have for you.
00:41:04.320 --> 00:41:15.540
Myles Brown: And so this four-day class kind of covers them all pretty well, and the machine learning piece as well. So that's the best path, I would say: take that one-day and four-day. You can take them in the same week; they're typically
00:41:15.990 --> 00:41:27.150
Myles Brown: scheduled together. Or, you know, if you're taking them virtually, then maybe take the one-day, you know, ruminate on that a while, then take the four-day class. And then, you know, basically you're ready to go for the engineering
00:41:28.950 --> 00:41:30.870
Myles Brown: Professional Certification
00:41:34.020 --> 00:41:38.070
Myles Brown: Now the other question was data engineering versus data analytics.
00:41:39.090 --> 00:41:41.550
Myles Brown: I think it's an interesting one, you know,
00:41:42.720 --> 00:41:55.710
Myles Brown: In Google Cloud, they make a real distinction: a cloud engineer is somebody who is, you know, really in there, setting up the infrastructure and then using it. A data analyst might be more of a,
00:41:56.340 --> 00:42:01.260
Myles Brown: you know, not deeply technical person. They're more on the analytics side. They might be,
00:42:01.950 --> 00:42:07.740
Myles Brown: They might be able to write SQL, but they're not real programmers. They're not administrators, you know.
00:42:08.280 --> 00:42:20.760
Myles Brown: And so that's more, you know, it's an analyst rather than somebody who's actually, you know, launching stuff and running things. They might be just connecting to, you know, BigQuery and running SQL queries.
00:42:24.030 --> 00:42:30.270
Myles Brown: So that's a couple good questions there. If you have any other questions, I'll throw my
00:42:31.380 --> 00:42:32.790
Myles Brown: Email address
00:42:33.930 --> 00:42:39.510
Myles Brown: here: it's myles dot brown at techdata dot com.
00:42:42.120 --> 00:42:51.360
Myles Brown: Feel free to send me any questions. If it's questions about a specific class, you know, I might eventually forward it to a salesperson to help you out. But otherwise,
00:42:52.770 --> 00:42:56.190
Myles Brown: You know, I can answer a lot of the basic questions around that stuff.
00:42:57.360 --> 00:43:03.000
Myles Brown: So I think I'll wait just a couple minutes and see if there's any other questions coming in.
00:43:17.760 --> 00:43:19.020
Myles Brown: Kept it in under an hour.
00:43:20.250 --> 00:43:20.760
Myles Brown: Good stuff.
00:43:22.230 --> 00:43:28.950
Michelle :: Webinar Producer: Thank you everyone for taking some time out of your day. You know where to reach us if you do have any questions. As always, thank you, Myles.
00:43:30.600 --> 00:43:32.130
Myles Brown: Excellent. You stop recording