Hello, everyone. Welcome to today's webinar, Understanding Serverless and Functions as a Service. This is the third and final webinar in a series presented by Randy Abernethy, managing partner at RX-M, a leading cloud native consulting firm based in San Francisco. Before Randy gets started, I just have a couple of housekeeping items to go over. During this webinar, all of your phones are on mute. So if you have any questions, please enter them in the chat box on the left. We're going to answer all questions at the end. Today's webinar is being recorded, and we're going to be sending a link out to all of our registrants. Now, let's get started, Randy, whenever you're ready.
All right, great. Thank you very much. Welcome everybody. This is the Serverless Systems webinar, and we've got a lot of interesting stuff to cover in the next hour. This seminar is presented by ExitCertified, Tech Data, and also RX-M. RX-M's a cloud native consulting firm. We work with ExitCertified and Tech Data to build unique kinds of training programs focused around cloud native, with microservice-oriented, container-packaged, and dynamically managed being sort of the definition of cloud native. But serverless has become a very interesting and rapidly growing component of what is ostensibly still the cloud native space. Certainly, serverless is a cloud native technology. But it has some very different characteristics from the more traditional microservice deployments that people would associate with maybe Docker and Kubernetes and tools like that.
So in this seminar, we're really going to focus on what serverless is all about, some of the tools and technologies that have cropped up in the serverless space, and what sort of initiatives are ongoing in the open source space to kind of promote sort of interoperability and things of that nature. So we can begin just by picking up with what is serverless. So pretty clearly, it's a technology where you don't have servers.
Now, obviously, there has to be a server. There's a place where the serverless execution is taking place. But the key point about serverless is that you have logic on demand. So you have the ability to create a function, typically, deploy it, and then have it execute. You don't provision servers. You don't back them up. You don't maintain them. You don't worry about scaling. You just provide your function, and then you typically wire it up to some sort of an event.
So once you've got this ability to create functions and deploy them in some cloud-like platform, you have basically a very, very high level of abstraction to work with, and you can build entire applications this way and extend them with these little bite-size bits of business logic.
Now, like any new technology or any new movement, it sounds great, and there's this wonderful sort of esprit de corps that sweeps through the industry when the technology is introduced. But then as people start using it at scale, they find themselves grappling with a lot of the challenges. So I would stress that serverless is sort of where cloud native, microservice-oriented types of platforms were maybe a couple of years ago.
Amazon has Lambda, and they've been offering that for years now. But people are using Lambda in very specific ways, as we'll take a look at. Other technologies are cropping up, and tools for all of the things that you need to use these systems in earnest, like debugging and monitoring and managing large-scale sets of functions, are all pretty new.
So we're seeing a lot of energy there and a lot of exciting things happening, but it's early days. One of the most exciting things about functions as a service is the billing, because the billing is organized by function. So you could, in a lot of these clouds, create 50, 60 functions and pay nothing until someone uses them. Then as people use them, you pay for use. VMs are sort of where this revolution started in the cloud, where we got self-service and pay-as-you-go. So we have the ability to click a button, and boom, we've got a virtual machine. But you are paying for all of that quad-core, 16-gig virtual machine, whether you use it all or not. If nobody's touching that machine between the hours of 1:00 AM and 5:00 AM, you're still paying for it.
So with functions as a service, you're really just paying for them when you're using them. The whole idea behind serverless is you don't pay for the server. So that's a pretty fantastic aspect of it, and it can be a game-changer from a price-performance standpoint for certain types of applications.
Now, another thing to think about is how this could also be reflected back to containers, if you think about paying for virtual machines and then compare that to container technology. If I want to run a Kubernetes cluster, I have to stand up virtual machines, which is sort of odd since I don't want to run virtual machines. I want to run containers.
So this ability to basically just pay for containers would be sort of an interesting thing, and many folks at my shop used to think that it was just a matter of time before some cloud vendor was going to start charging for containers, pay for container utilization, sort of like Heroku does with Dynos and things like that on the PaaS side, carrying that over into containers as a service and providing a similar billing model.
But curiously, Amazon, usually a leader in coming out with technologies before anybody else, still doesn't have such a thing, but Microsoft does. So Azure has really been making huge strides and is now by most measures the number two cloud platform, and they were one of the first ones to come out with pay-for-the-container rather than the VM.
So we're seeing billing models change as the level of abstraction rises, and the billing models are changing to match the ways that people actually want to use these systems. So interestingly, everybody who's offering functions as a service or a serverless type of environment is pretty much billing for the function usage rather than the virtual machine. So that's kind of a pretty exciting piece of the puzzle, at least for CFOs and CIOs who are trying to manage costs. There are some interesting things you can certainly do there if you design things right.
Then the final point here is that serverless approaches can basically give you the ability to completely scale out to massive size. How well is this done? That's a very interesting question, right? If I have a function that I've uploaded to, say, Amazon, and 15 people are calling that function over the internet, and then all of a sudden it's 15,000 or 1,500,000, what happens? Is it going to scale instantly? Is it going to take a while? Those are questions that you can't answer in a vacuum; you have to talk about a specific platform and how it works.
Because all of this is so new, there's a lot of variation. So the responsiveness and the ability of these systems to scale out varies from vendor to vendor. So we'll take a look at each of the vendors and talk about some of the pros and cons of each of the offerings that they provide. So here's a good visual model of a serverless solution that you could deploy on Amazon. This is straight out of the Amazon playbook, right?
So if you go to the Amazon website and look up serverless, Lambda, that sort of thing, you're going to see a model something like this. So, a little bit of terminology before we dig in. Serverless means you don't pay for the server, right, or you don't manage the server. It's being handled for you by the platform. But we also sometimes call the main serverless facility that we have, the ability to just run functions, functions as a service.
So you might hear FaaS or something like that. The idea is that we can just push functional code up to the cloud and then just have it run based on events. Well, there are a lot of possibilities for triggering these functions, and it doesn't have to be through a sort of RESTful-style API gateway. But in this example, that's exactly what's happening, and that is a common use case.
So imagine that you want to run a website. You want to have a dynamic website that customers can use, but you don't want to pay for servers. You don't have any idea what the utilization is going to be. Maybe it's going to be very, very spiky. You're going to need huge scale at certain times of day, in certain weeks, and then very, very small scale at other times, and you just sort of want to get rid of the whole worry of managing VMs and being billed for VMs. When you're using VMs, you always have too few or too many, right? It's almost never the case that you have the exact right amount.
Of course, you can autoscale VMs and things like that. But booting up a VM takes a while, right? If you want to be able to scale in a very responsive way, you kind of need a different structure. So we could build a website such that our clients fire up a browser and then use a URL that takes them to Amazon S3. Amazon S3, Simple Storage Service, is basically a bucket-oriented kind of object store, and all of the objects that it stores are identified by URLs. If we make that URL public, anybody with a browser can load that URL.
Well, on S3, we can put HTML files. We could put CSS. We could put JavaScript and all of the things that we need to supply our website. So basically, it's a serverless website, more or less. But then what about the dynamic part of it, right? What if this is a single page web app, and we want to do a bunch of XHR to do dynamic retrievals from databases and be able to update things and fun stuff like that? That's usually where you need to get into running a VM with a web server and then installing a CGI gateway or using PHP or Rails or something to get those interactive requests processed and sent back out.
Well, with serverless, we can skip all that. With serverless, we can, for example, use Amazon's API gateway. Their API gateway allows us to create a URL. So let's say, for example, we're working with the SPCA, and we're trying to help people adopt dogs that we've picked up off the street or what have you. So we might have a route, /dogs, and when you GET that route, it's going to give you a list of all the dogs.
So what we could do is go into the gateway and set up this /dogs route, and we could say, when somebody hits the GET verb on that route, invoke this function. So we could build a Lambda function then that just goes to Amazon's DynamoDB, pulls up all of the dogs that we have that have not been adopted, formats them as JSON or something like that, and sends them on back (there's a rough sketch of this just below).
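As a rough sketch, assuming a hypothetical DynamoDB table called Dogs and API Gateway's Lambda proxy integration, that function might look something like this in Node.js:

```javascript
// Minimal sketch of the /dogs Lambda; the "Dogs" table and "adopted"
// attribute are hypothetical names, not from the webinar slides.
const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient();

exports.handler = (event, context, callback) => {
  // Scan for dogs that have not been adopted yet.
  db.scan({
    TableName: 'Dogs',
    FilterExpression: 'adopted = :a',
    ExpressionAttributeValues: { ':a': false }
  }, (err, data) => {
    if (err) return callback(err);
    // API Gateway's Lambda proxy integration expects this response shape.
    callback(null, {
      statusCode: 200,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(data.Items)
    });
  });
};
```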
Then our XHR routine in the browser could display all that stuff in a nice chart. So in this model, we don't have a single VM. It's just services that we're using, and the place where the business logic lives is right here, right? The front end business logic lives in these static pages that we're loading into the browser, and the backend business logic lives right here. Everything else is glue, right?
When you think about this gateway, this is the guy that's generating events, right? So events are really created by a user in this case, but they flow into this gateway, and the gateway creates an event that fires off that function. Since this function is running in Amazon's infrastructure, really running on a VM in EC2 in Amazon's world, it has access to all the Amazon services. It could do big data stuff. It can hit NoSQL databases or SQL databases, whatever you'd like.
You could wire up services from Amazon, like Cognito, that take care of user authentication and user account management, things like that. There's all sorts of interesting ways that you can wire up these services. So if we wanted to take a quick tour of the history of functions as a service, or serverless, it probably started in 2006 with a little company called Zimki, which was basically a spinoff of Canon Europe. The idea that they had was: there's people building lots of code in the browser, and when they want to make XHR requests to the back end, wouldn't it be cool if you could just provide the JavaScript code that you wanted to have those XHR functions call?
So they built a platform as a service. It was really a PaaS. But where everybody else at that time was building PaaSes for Ruby on Rails or maybe frameworks for Python or something like that, these were the first guys that were thinking, "We could do this with JavaScript, and it could be JavaScript all across the board. Everything's event-driven in JavaScript. So why don't we just let people upload individual discrete functions, and then we can invoke them directly?"
So whether it was knowingly or unknowingly, they sort of invented this whole space maybe before its time actually. Then they had a little bit of a shutdown, and there was some disturbance at [inaudible 00:14:50] and that kind of ended them sort of the next year. So it's sad because they were onto something pretty interesting probably, and I think at the time, their platform was growing, and they were getting adoption, and it was looking like a success, but it just didn't seem like it was a strategic thing for Canon, so they closed it.
But it took seven years before somebody else did something similar, and it was Amazon introducing Lambda. So the Amazon Lambda platform gave us a really large scale cloud vendor offering functions as a service, where we could upload Node.js code, Node not surprisingly being their first option for an event-driven kind of platform. Then they added Python, then Java, C#, kind of all the big popular languages. They've also got the ability, through the Node.js build pack for their Lambda functions, to invoke a C++ program or something like that. It's a little hokey, but you can do it.
So another thing that I always thought was curious is that they haven't provided a containerized version of their Lambda offering just yet. If you are familiar with Amazon, they have a PaaS called Elastic Beanstalk. Elastic Beanstalk originally had build packs, right, where, "Hey, we've got a C# build pack." So you can upload your C# code as long as it's using .NET version this and these libraries and that sort of thing. If you want to use a different version or different libraries, then sorry, we don't support that. If you want to use a language that we don't support, like Erlang, sorry, you can't use our PaaS.
So that was kind of a big problem with all of the PaaSes that were generic, like Elastic Beanstalk, like even Google's App Engine. Where Heroku hit a home run was they realized Ruby on Rails was a phenomenon, and they could easily provide a hosted Rails solution with all of the platform stuff, like databases and message queuing and web servers, just provided, and people could just push up their code and use gems to bring in their libraries, with a few different Rails build packs that could cover most of the turf, and a lot of people would be interested in it. That's exactly what happened. Then Django on the Python side and Express on Node.js and things like that were the next steps.
So here, it was sort of like the same thing happening all over again with this whole build pack thing. We don't have the flexibility that you get with containers, where I can put anything in a container. If you can support containers, it could be Erlang or OCaml or D or whatever. But when we're using build packs, like Lambda does, we're sort of stuck with the languages that they support.
But if you look at the date, Docker was nowheresville in 2014. There was no Docker Hub up until the summer there. So it was pretty early for maybe kind of trying to adopt containers. So maybe we can forgive them for that. Google Cloud Functions shows up two years later. So it takes two years for the next cloud vendor to show up. So even though Amazon is the 800-pound gorilla in this space, they still are very, very inventive.
Then interestingly, everybody else sort of came in line right then and there. You had Azure Functions. By November of that year, you had IBM OpenWhisk. IBM OpenWhisk was a really neat product because OpenWhisk did support containers right out of the gate. So you can actually upload a container with your function in it to OpenWhisk, and then whatever's in that container can be invoked by the events that you wire up to it. That is now Apache OpenWhisk.
So that is an open source project. It's Apache hosted. Obviously, IBM is still a big contributor. They created it originally, but there's lots of people contributing and using it. Because IBM's infrastructure cloud is OpenStack, this works swimmingly on OpenStack. But you can run OpenWhisk standalone if you want to. If you choose, of course, to run your own function as a service platform, you sort of miss the benefit of not having to pay for VMs, because obviously, you have to have some computers to run OpenWhisk on.
Whereas if you're using it as a service, if you use OpenWhisk on the IBM Cloud, which used to be called Bluemix, you of course don't have to pay for the VMs. You just use OpenWhisk directly. But if you're going to deploy it on your own in your own data center or on your own OpenStack cloud, you're of course going to have to provision some machines to handle it, and the number of machines that you provision is going to limit the number of functions that can ultimately be run.
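Coming back to the container support point, creating and invoking a Docker-based action with OpenWhisk's wsk CLI looks roughly like this; the image and parameter names are placeholders:

```bash
# Create an action backed by a Docker image instead of a build pack.
wsk action create transcode --docker myrepo/transcode-action

# Invoke it and wait for the result.
wsk action invoke transcode --result --param file input.mov
```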
So there's some really nice synergies between functions as a service and public cloud, because when we think of public cloud, we sort of think of almost unbounded scalability, right, and that's sort of nice for this whole function thing. Another thing that cropped up here in 2016 was the first serverless framework, called, fittingly, the Serverless Framework. The idea behind that was: hey, if I build stuff for Lambda, and then all of a sudden Google offers this amazing pricing, it's not easy to switch, because the event integration and the way that you hook your functions into the interface of the cloud vendor is different. So you're going to have to do a porting job to get things to move over, and some cloud vendors support some events and other cloud vendors don't.
So the idea behind the Serverless Framework was to give you a development environment that you could build serverless applications or serverless services with, and then be able to push them to any cloud backend that you prefer. So that was kind of a really nice add, and we've seen more of that. Then fast forward to 2017, and now we're in a world where cloud native is becoming just this massive wave hitting the enterprise, and everybody's getting very, very interested in the possibilities with cloud native because of two things. The first is scalability.
So the biggest companies in the world, scale-wise, Twitter, Google, Netflix, what have you, they're all embracing microservices, container packaging, dynamic orchestration to get the scale that they need. But you also have these firms really being excited about the rapid innovation that the microservice kind of approach brings to the table. So we have the ability to atomically deploy individual services. So teams are sort of liberated from big integration and gridlock with lots of kind of communications overhead and things like that.
You have teams being able to operate at their natural pace to innovate, and also, you're decomposing the application by business functions. So one could argue that these things sort of carry over to the function as a service model in many ways, if you want to do it right and if you want to get all of the benefits from it. But also, these cloud native platforms are really getting huge traction, whether it's Kubernetes, or Kubernetes on Mesos, or Docker, which has just announced support for Kubernetes on Docker Enterprise. So you've got all these platforms supporting Kubernetes. Wouldn't it be nice if we could run functions on Kubernetes?
Of course, OpenFaaS shows up in January of 2017 as the first Kubernetes function as a service framework to run functions directly on Kubernetes. So whichever platform you choose and whichever environment you are interested in working with, there's a function as a service option growing. This is, by the way, the Cloud Native Computing Foundation landscape, and it's updated pretty regularly. This is 9.9, I guess. I think there's probably 9.11 or something out. You can find it on the web right here. It's on GitHub and freely available. It's a really cool way to kind of get your head around all of the different layers, the infrastructure at the bottom, the provisioning systems that can dynamically provision that infrastructure, the runtimes, whether it's the container piece of the puzzle, the networking piece of the puzzle or the storage piece of the puzzle and then on up through orchestration and app deployment, full vertical platforms and so on.
Well, this box right here is the function as a service box. That's the serverless box. I mean, look at that. It's not like one or two logos in there. It's a pretty active space. A lot of things have cropped up. So the Cloud Native Computing Foundation, it's a nonprofit under the Linux Foundation. It hosts some of the most important products or open source projects on the internet for just doing cloud native stuff, like Kubernetes, for example, and others.
But there's also a working group that is focused on cloud native and its relationship to functions as a service, or serverless. So in regards to that, we of course have in the landscape this serverless, event-based kind of section. You'll find logos here that are public cloud related. So PubNub BLOCKS, sort of a publish-subscribe fabric for connecting things together with functions. You've got IBM Cloud Functions, which is OpenWhisk. You've got Google Cloud Functions, Azure, and Lambda. Then you've got some database-centric ones.
So Oracle has APEX, Application Express, where you can kind of do API event integration with Oracle databases and build full applications just on top of the database. Then you've got Firebase, which has been around for a while, sort of a cloud hosted database that you can easily integrate applications with, and so this works great as a backend, but also has now an event API that you can use with Google Cloud Functions directly.
Then you've got frameworks and IDEs showing up. So, not to be confused with that APEX, there's another Apex, which is a dev front end for AWS Lambda. There's Serverless, which is a multi-backend development environment, and then Webtask, which is sort of an IDE that's actually hosted in the cloud, where you just build your functions right in the IDE, and then they're available, and you can run them.
Open source platforms: there's a bunch of these also. We mentioned OpenWhisk and OpenFaaS, but there's also NStack and Kubeless and Fission. Most of these are targeting Kubernetes, though some are actually for analytics. NStack is a cross-cloud analytics platform. So a lot of interesting things happening here. Given how new this is, you can see that it's a very hot topic. A lot of people are asking about it, and vendors want to have an answer. So when that happens, sometimes their answers are a little premature or optimistic, I will just say, and we'll leave it at that.
So AWS Lambda is the one that's been around the longest. So they've got a lot of kind of architectural models in the Lambda side of things that just sort of integrate applications from end to end. So this is just another way to look at that app that we talked about earlier, where you've got a bunch of HTML and CSS in S3 and browsers can just grab that directly, and then the JavaScript that's baked in there to make XHR requests dynamically to the gateway, and the gateway can hit Lambda functions, which can go to DynamoDB or a relational database or something like that.
But we can build much more complicated things as well. To that end, Amazon is trying to make it simple and easy to get started. So they've got over 80 Lambda blueprints. When you go in to create a Lambda, you go into the Amazon Console, go to Services, hit Lambda, build a new function, and then you can select a blueprint if you like. So they've got blueprints that do batch jobs with Python, that grab objects out of S3. Different programming languages are exemplified, stuff that's working with Kinesis, and all sorts of things.
So there's 80 blueprints, right? There's a lot of different blueprints, 83, actually, in a recent snapshot, a lot of different blueprints that you can use to get started. You click a blueprint, and it just deploys the function right out of the gate with all the event wiring and everything. It's not going to do exactly what you want as-is, but all you need to do is change some of the obvious variables to wire it into your existing platform services and change the code to do something custom for you, and you're good to go.
So what do we use to trigger a function in the cloud? Lots of things, it turns out, depending on the vendor, right? So every vendor is different. So unfortunately, there is no standard, right? We don't have any stock integration points we can always count on. But there are some pretty common ones. The most common one is the gateway, right? Everybody has an API gateway.
So if you're a public cloud vendor, you're going to have an API gateway, and you're going to make it possible for people to map REST routes and verbs onto Lambda functions so that you can build dynamic web-based applications or APIs independently of any VMs. So that's one that you're going to pretty much find everywhere.
Another one that you're going to find just about anywhere is something that wires queue-based messages to a Lambda. You can imagine if I have a message queue and people are sending messages to this queue, it might be really nice if I just have a function that gets triggered every time a new message drops into that queue. So in the case of Amazon, that's Kinesis Streams, which is basically Kafka, so you can be monitoring a Kafka topic. You also have the Simple Notification Service. So that is a way that you can actually send messages out to the outside world through SMS or something like that; email, same type of thing. You can easily send emails out, but you can also handle [inaudible 00:28:58] functions invoked when emails come in.
You also have Simple Queue Service. I'm not seeing it here right away, but I'm sure it's in there, so SQS, Amazon Simple Queue Service, you can wire up to. You can also wire up to S3 events, which is really kind of neat. Imagine that you built a function whose job is to convert MOV files into MPEG files. So you could have a bucket in S3, and anytime somebody uploads an MOV, this function could get triggered. So the function gets triggered, and in the event data that it receives, it will have the name of the file or the URL.
So it'll grab that file, do its transcoding, and then it might have another bucket where it drops the file. As the file drops there, it could use the Simple Notification Service to send you an SMS or use the email service to send you an email, "Hey, here's your file. Come get it." So really, lots of stuff. CloudFormation: you're going to have functions that run after CloudFormation events when you're standing up infrastructure. You've got CloudWatch Logs and Events for monitoring of your platform. You can have functions invoked after certain situations occur there. You can schedule things.
So you can have things happen at a certain time of day, batch-oriented types of things. These can create all sorts of huge cascades. Another one that a lot of people are interested in on the consumer side is Amazon Echo. So you can wire Alexa functionality into these functions, an "Alexa, call function F" type of a thing, and have these functions perform the backend services that are necessary.
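Returning to the S3 transcoding example for a moment, here's a minimal sketch of that trigger handler; the topic ARN is a placeholder, though the Records[].s3 event shape is what Lambda actually delivers for S3 triggers:

```javascript
// Sketch of an S3-triggered Lambda that processes an uploaded file and
// notifies the user via SNS; the topic ARN below is a placeholder.
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

exports.handler = (event, context, callback) => {
  // An S3 event can carry multiple records; take the first for brevity.
  const rec = event.Records[0].s3;
  const bucket = rec.bucket.name;
  const key = decodeURIComponent(rec.object.key.replace(/\+/g, ' '));

  // ... transcode the file here and drop the result in an output bucket ...

  sns.publish({
    TopicArn: 'arn:aws:sns:us-east-1:123456789012:transcode-done',
    Message: `Your file ${key} from ${bucket} has been processed. Come get it.`
  }, callback);
};
```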
So there's a lot of possibilities, as you can see. One of the most important things about these functions is the events that are available to trigger them. There's really two pieces of this puzzle. There's running the functions and how that works under the covers, which is a little bit opaque, and there's the events that you can use to trigger them, right? Those are the two things that, as a developer, you probably really care about the most. How can I make my function run? How can I wire it up to the events in my application domain that I want to respond to? Then in fact, is it going to run in a performant way? What's the cost and all the operational aspects of running the function?
So the support on Amazon, we'll look at a few different cloud vendors, but we'll start with Amazon since they've been around the longest and probably have the most adoption. The way that you can get your functions into Amazon is threefold. You can either just dump some raw code in there. So you can literally go into the console or use the command line tool and just dump some code into the editor, and that's it, or you can upload a zip. So you can upload a zip with the SDK or with some buttons on the console, and then you can also put your code in S3.
So whatever works for you, you can do. S3 is kind of convenient. But if you've got a build pipeline that can automatically drop a zip, that works fine as well. In Amazon at the moment, Lambdas can be coded in C#, Node, Java, or Python. That's sort of the downside if you ask me. I mean, there's no containers, right? These are your options, as of... I don't know. This is probably a screenshot maybe a month ago. So it's possible they've added something else.
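As a concrete example of the zip path, creating a function from the command line looks roughly like this; the role ARN and names here are placeholders:

```bash
# Package the handler and create the function; the IAM role is a placeholder.
zip function.zip index.js
aws lambda create-function \
  --function-name list-dogs \
  --runtime nodejs6.10 \
  --handler index.handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::123456789012:role/lambda-exec-role
```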
But you can see they've only got particular versions of Node and particular versions of Java. So if you wanted to build your stuff with Java 9 today using the pre-release stuff, you can't. If you wanted to use Java 7, you can't. Those are the build packs that they've got. So we're sort of kicked back to the days of build packs and the platform as a service world when we're using AWS Lambda.
But the one interesting thing is that there are tricks and hooks, pretty easy to find on the internet, where you can use Node to launch apps built in other languages and things like that. But it still kind of takes away some of the elegance of the static runtime that you get from a container, with environment variables being baked in, command line arguments being baked in, and all that sort of stuff.
Sure, you can work around those types of things. But there are some restrictions to having to use build packs. So that's the basics of getting the code up there and what kind of code you can use. The next thing that we need to specify is the kind of events that we're going to hook into and which services you might want to use. So in the case of a Lambda, you've got your function name and description, the runtime that you're going to use, and then you've got your code, and your code can do whatever it really wants.
But in the case of AWS, we get this standard handler invocation right here. So the handler is going to get an event, a context, and potentially a callback. So you can check to see what kind of operation is being invoked, and that's defined as your event operation. You can look at the event payload. And what do these things mean? What is the event operation? What is the event payload?
Well, that's a darn good question, and it depends on which cloud you're in, and it depends on which events you're wired up to, what that is going to look like. Amazon has a pretty good handle on how they want it to work and how they use this interface to wire you into anything from an SQS message landing in a queue, to someone dropping a file in S3, to an API gateway method being called. They're all very different events, but the same API is used; the same mechanism for invocation is used regardless.
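In Node.js terms, the shape is roughly this; the "operation" field is purely illustrative, since, as just noted, the actual payload depends on the trigger:

```javascript
// The standard Lambda handler shape: an event, a context, a callback.
// What's inside `event` depends entirely on what triggered you.
exports.handler = (event, context, callback) => {
  console.log('invoked as:', context.functionName);

  switch (event.operation) {   // hypothetical dispatch on an event field
    case 'create':
      // ... handle a create request ...
      return callback(null, { status: 'created' });
    default:
      return callback(new Error('unknown operation: ' + event.operation));
  }
};
```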
Now, another thing that you might note is that this body of code allows you to put code outside of the handler, right? So you might have some setup code. Well, this is another curious thing, right? The setup code runs once. Then the handler can be executed over and over and over again. So under the covers, what's happening is these cloud vendors or platforms are basically saying, "Okay, nobody's using your Lambda, so I'm not going to run anything, but I have maybe a box ready to go just in case. Oh, someone's using your Lambda."
I'm going to create a container, right, under the covers, of course. I'm going to run your startup code, and then I'm going to invoke your handler. Now, the second time somebody calls it, I've got a warm container on a warm VM ready to go. So the latency between those two scenarios is going to vary widely, right? On one end: the startup code has to run, then I invoke the handler, and worst case, I might even need to spin up a VM first.
On the other end: the VM's running, the container's running, the startup code has already executed, and I'm just going to invoke the handler. As you scale, you can imagine more containers get created, and more VMs get created to run more containers. Are these dedicated, or are they multitenant, right? Those are questions, again, that each individual cloud vendor has to answer for themselves. Some of them are configurable. Some of them aren't. Some of them, you are going to run yourself, and you decide. So there's lots of possibilities here for how this stuff actually gets invoked under the hood.
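Here's a tiny sketch of that split, with the setup code outside the handler and a counter that survives warm invocations in the same container:

```javascript
// Everything outside the handler runs once per container (cold start);
// the handler itself runs on every invocation.
const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient();  // setup: cold start only

let invocations = 0;  // module state survives across warm invocations

exports.handler = (event, context, callback) => {
  invocations += 1;
  // On a warm container this will be > 1, which is a simple way to
  // observe container reuse for yourself.
  callback(null, { invocationsInThisContainer: invocations });
};
```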
Big data is another environment where people are pretty excited about Lambdas, because big data people are really super focused on the analytics. They don't care so much about the platform or the tools beyond wanting to use them. So setting up all the infrastructure and Hadoop and Spark and all these things is not the thing that the big data or the data scientists are keen on. They're keen on doing analytics.
Well, if you start thinking about functions as a service, you're dealing with a very high level of abstraction now because you're just writing the analytic code in that function, and you can sort of forget about everything else. So there's a lot of things to like about functions as a service or serverless in a big data context.
So you can imagine a scenario where maybe somebody posts some data to a logging endpoint through an API gateway, and then that invokes a Lambda function. Maybe that Lambda function does some formatting, tweaks some stuff, and then writes that data to a Kinesis Stream. That Kinesis Stream, essentially a Kafka topic, could be monitored by anybody, including another Lambda function, which then could write that data into DynamoDB, where web apps could immediately see that data, that sort of thing.
Don't forget that putting something into a Kinesis Stream topic could kick off 17 different Lambda functions in parallel if it needed to. Then you've got things like DynamoDB emitting events, right? We saw all those different things in Amazon [inaudible 00:38:16] events. So you could have any new record, any new aggregate element being dropped into a DynamoDB database, a new JSON block, cause an event, and we could have another Lambda that triggers off of that, and that guy might dump some stuff into Kinesis Firehose, which then sends it out into an S3 bucket, which you can then do Elastic MapReduce on, or Redshift, or what have you.
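As a sketch of the first hop in a pipeline like that, a Lambda behind the gateway that reformats the posted data and writes it into a Kinesis stream might look roughly like this; the stream name and fields are placeholders:

```javascript
// Sketch: reformat an API Gateway proxy request and push it to Kinesis.
const AWS = require('aws-sdk');
const kinesis = new AWS.Kinesis();

exports.handler = (event, context, callback) => {
  const record = JSON.parse(event.body);  // body of the posted request

  kinesis.putRecord({
    StreamName: 'ingest-stream',              // placeholder stream name
    PartitionKey: record.userId || 'default', // hypothetical field
    Data: JSON.stringify(record)              // downstream consumers read this
  }, (err) => {
    if (err) return callback(err);
    callback(null, { statusCode: 202, body: 'accepted' });
  });
};
```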
So it's very, very easy to thread together really sophisticated pipelines without a single server. There's not a single server in that map, right, in that model, and that's another one straight out of the Amazon playbook. So they've got a bunch, and it's really interesting to kind of look through some of the models that they've created. So switching gears and looking at some of the other cloud platforms, there's Google Cloud Functions as well. Google Cloud Functions was one of the quick follow-ons to Lambda, two years later, but one of the first ones, and you can see it's a very similar thing, right?
You go into their console, or you can use their gcloud command line tool if you want to. But you're going to create a function. You're going to give it a name, and you specify a region that it's going to run in and how much memory it's going to get. In the case of billing, that often matters, right? How much memory your Lambda uses is going to matter, because if your Lambda is a big data Lambda, it could use a whole lot of memory, and that's going to restrict the number of Lambdas that Amazon or whoever is going to be able to run on a given VM.
So that's going to increase their costs. So this is often one of the cost factors and something that you may need to specify a limit on or pay for if you don't have a limit. Timeouts: also important, right? What if your function gets invoked and it runs for seven hours? Not because it's supposed to run for seven hours, but because it's broken, right? They are charging you for time, right? They charge you for events, yes, "my Lambda has fired," so you pay a little bit for that, typically; it depends on the vendor.
You're also going to pay for CPU time. So if you run for seven hours when you only should have run for three seconds, that's going to be bad. So you might want to set a time out. Then you've got your triggers. So obviously, a much smaller set of triggers available here on the Google Cloud side. Again, this was maybe a snapshot from four months ago or something, but still, they're playing catch-up.
You've got topics that you could specify if it's a pub/sub message that you're going to fire an event on, and of course, they have their HTTP gateway stuff. How are you going to get your source code up there? You can edit it inline. You can do a zip. The zip can come out of their cloud storage, or you can use a cloud source repo if you want to, using their source code management stuff. So that's an example, and it's pretty similar to what you probably noticed on the Amazon side, with a few more details.
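For comparison, deploying from the command line looks roughly like this; the flags are from roughly that era of the gcloud beta, and the names are placeholders:

```bash
# Deploy an HTTP-triggered function with a memory cap and a timeout.
gcloud beta functions deploy listDogs \
  --trigger-http \
  --memory 256MB \
  --timeout 60s \
  --stage-bucket my-staging-bucket
```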
So what do the economics look like? Well, this is just an example, right? Yeah. You have to kind of look at the cloud that you're dealing with. But there's a free tier in a lot of these. Jeff Bezos's mentality is to dominate the space, and he seems to be historically pretty willing to slash pricing, in fact, cannibalize himself in order to stay the dominant player. This is the innovator's dilemma that a lot of companies have a hard time with. Amazon seems to be pretty good at it.
So Lambdas are for a lot of things cheaper than spinning up VMs and doing the same work. They knew it when they set the pricing, but they wanted to get people moved over to that because it would be sticky because they had a two-year lead, and that is exactly what happened. But if you look at all of these cloud vendors, they all provide you a free tier so that you can try it and kind of get hooked on it if it's going to work really well for you, and maybe the cloud vendors aren't so excited about us figuring out interoperable way to run functions on any cloud without the stickiness.
But that sort of seems to be where things are going. So anyway, from a pricing standpoint, invocations, you can see here, above the free limit, it's 40 cents per million invocations. So that's the event side of it, right? You pay 40 cents for a million events that cause a function to get called. Then compute time is a fraction of a cent per gigabyte-second, and, excuse me, this second one should be gigahertz-seconds, right.
So this is one second of wall clock time with one gig of memory provisioned. So that's the memory side of it, and then this is the CPU side of it. So this is really memory-seconds, and this is CPU-seconds. Then you're also going to pay for outbound data. So if you send data outside of the cloud, they're going to charge you for that. If data is coming into the cloud, sometimes you get charged for that. Sometimes you don't. Here you're allowed to pass any amount of data that you'd like into the Lambda.
Then we've got, of course, outbound data to Google APIs in the same region. Usually, clouds will charge you for moving data between regions, because that causes them to have to use their wide area network, and a lot of times, they'll mark that up. But in this case, again, on the Google Cloud side, it's free. So this isn't meant to be the global pricing that you should expect everywhere. This is just an example so that you can start thinking about the types of things that you're going to pay for.
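To make that concrete, here's a back-of-the-envelope calculation using example rates like the ones on this slide; all the numbers are illustrative, and it ignores the CPU component and any compute free tier:

```javascript
// Rough monthly cost sketch: 10 million calls, 256 MB, 200 ms each.
const invocations = 10e6;
const freeInvocations = 2e6;     // example free tier
const perMillion = 0.40;         // $ per million invocations

const memoryGB = 0.256;          // 256 MB function
const secondsPerCall = 0.2;      // 200 ms average runtime
const perGBSecond = 0.0000025;   // example memory rate

const invocationCost =
  Math.max(0, (invocations - freeInvocations) / 1e6) * perMillion;  // $3.20
const computeCost =
  invocations * memoryGB * secondsPerCall * perGBSecond;            // $1.28

console.log(invocationCost + computeCost);  // a few dollars, not hundreds
```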
So it's really a good idea to do some experimentation and to look closely at what you're paying for and how you're paying for it and then to ask yourself, is this a good economic trade? It might be that you'd pay more to use cloud functions than you would to run stuff on VMs. But it also might be that the productivity bump is well worth it. So you just have to weigh all the factors when you're looking at these things.
Azure Functions: we've got triggers here as well. There's cron and platform events, and they have a ton of them, and then of course the HTTP servicing endpoints. As you can see though, Azure has got a ton of different languages. You've got JavaScript, of course, C#, of course, and F#, Python, PHP, Bash, so you can just write Bash shell scripts; Batch, the standard Windows batch stuff; PowerShell. Any executable will work.
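Here's a minimal sketch of an HTTP-triggered Azure Function in JavaScript; note the handler shape differs from Lambda's, with the context first and context.done instead of a callback:

```javascript
// Classic Azure Functions JavaScript model: context, then the trigger input.
module.exports = function (context, req) {
  const name = (req.query && req.query.name) || 'world';

  // For HTTP triggers, the response goes on context.res.
  context.res = {
    status: 200,
    body: { message: 'Hello, ' + name }
  };
  context.done();  // signal completion to the runtime
};
```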
But again, no containers, sadly... But it does integrate with Visual Studio Team Services, GitHub, Bitbucket. One of the cool things about Azure in general is that you can run Azure in your own data center. So you can use the Azure cloud, but then you can run the Azure cloud in your own data center, right? It's not like Amazon, where you can use Amazon or you can use Amazon. It's actually possible to run an Azure cloud in your own data center, and you can run Azure WebJobs SDK stuff in your own data center as well.
So they've got a complete repo with the whole Azure Functions stuff in an environment that allows you to run it locally. So you can use it locally and then have the same exact platform on Azure to run those functions there as well. All right. So OpenWhisk, we mentioned OpenWhisk a little bit. It's Apache OpenWhisk. It's hosted on what used to be Bluemix, what's now IBM Cloud, and it also runs on OpenStack, and you can run it on AWS if you want to, and you can run it on Vagrant. You can really run it just about anywhere. Anywhere there's a computer that runs Linux, you can run it.
But there's installers that specifically work with these different environments. So it's open source, easy to stand up, easy to use. The cool thing is it supports containers. So Bluemix, in my opinion, got it right from the standpoint of the code model that you want to support. Serverless, just a quick note on these guys. It's an npm-based install that you can then create application templates from, and then you can deploy them to different backends. They support AWS, Azure, Google Cloud Platform, and OpenWhisk.
So a nice option for people who want to build things independently of a given backend. Now, of course in order to make your code abstract from the back end, you have to use the intersection of features from all the vendors that you want to target, which may limit you. But it's a neat project, and it's been growing pretty rapidly. They originally just supported AWS, and that's the one that they support the best. They've more recently added the actual implementations for these guys. I'm not sure if they're still in beta or they're actually GA or not yet.
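For a flavor of it, here's a minimal sketch of a Serverless Framework service definition targeting AWS; the service, function, and route names are placeholders:

```yaml
# serverless.yml sketch: one HTTP-triggered function on AWS.
service: dog-adoption

provider:
  name: aws
  runtime: nodejs6.10

functions:
  listDogs:
    handler: handler.listDogs   # handler.js exports listDogs
    events:
      - http:
          path: dogs
          method: get
```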
So what are the use cases? There's a bunch. Mobile backends, right? Most of your business logic is in the front end, and you just need state management and things like that and then a little bit of glue code; a mobile backend with some functions as a service could be a great option. APIs and microservices. You can build entire microservices out of these serverless functions. You just say, "Okay. I have these three RPC calls that are part of my service interface, or these three REST calls that are part of my service interface, and I'll implement them with separate functions."
You can also, as you can see from some of the examples that we looked at, do data processing, of course; extract-transform-load types of things work really nicely with these functions, and pipelines and webhooks too. They're also great for IoT. A lot of times, you'll have IoT devices emitting messages, and you might need big scale for receiving those messages and then doing some pre-processing or cumulative batching of those messages. Filtering, windowing, things like that can all be easily done in a functions as a service environment.
There are some conferences happening. So you'll see that there's 127 meetups for serverless out there with 30,000 people in the membership list. So that's the URL down there at the bottom. You've also got AWS re:Invent. If you go to the AWS re:Invent website, and you just do a search on sessions for Lambda, you'll see that there's something like 15 serverless Lambda talks, and then there's another like 10 workshops, and then there's another 20 things related to it. They actually have a mini-conference for serverless at AWS re:Invent.
QCon has a lot of stuff in the space, and then there's Serverlessconf.io. They had a Hell's Kitchen conference in October just recently. So it's an active space, and you see a lot of this at CNCF's KubeCon/CloudNativeCon, which is also coming up in December right after re:Invent. And the CNCF, for that matter, has a serverless working group.
So if you want to track things like, what is this? What are the boundaries of serverless, right? Clarity for ourselves and consumers is what the serverless working group is working on. Common terminology, right? So we can all call things the same thing. What is the scope of the space as it exists today, and where's it going? Common use cases and patterns, and where this all fits with PaaS and container orchestration and how it relates to the cloud native space.
So these are all public, right? The Cloud Native Computing Foundation is an open source, public, multi-vendor, very open type of environment, and you can go to the GitHub website and look at all the different working groups, but wg-serverless is the one where the proposals and the write-ups and the white papers and things like that coming out of this group of pretty smart people are landing.
So not a bad place to monitor if you're keen on watching what's happening in this space. One of the things that that team has put together is a basic model, and it looks simple, but there's some really important things here. Here's your function, and there's an event API, right? The most important thing that we need for portability is this glue, right? How does the invoker, whatever it is, call us? What are the things it's going to pass us, and how do we get the information that we want, right? If that were standardized, we'd have a really portable environment.
So we need to understand the different events that can be generated and how we're going to receive them. Then you've got internal events coming from the platform and external events coming from messages, from Kafka, HTTP, or what have you. Then if we look at the different possibilities here, there aren't really settled standards out there yet, but there are some initiatives, like OpenEvents, and then there's Cloud Auditing Data Federation from the Distributed Management Task Force.
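Just to illustrate the idea, a standardized event envelope might look something like this; the field names are hypothetical, loosely inspired by those discussions rather than any published spec:

```javascript
// Purely illustrative event envelope; not an actual standard.
const event = {
  eventType: 'com.example.file.uploaded',    // what happened
  eventID: 'a1b2c3',                         // unique per occurrence
  source: 's3://uploads-bucket',             // where it came from
  eventTime: '2017-11-01T12:00:00Z',         // when it happened
  data: { key: 'video.mov', size: 1048576 }  // the payload itself
};
```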
So there's a few different things that have been contemplated when it comes to different event standards. So you can see that the working group is coming up with kind of different invocation models and thinking about different ways that these functions as a service could fork and join and permute as they're used in different patterns. So I know we're getting kind of close to the wire here. So I think what we'll do is just kind of leave it at that and maybe take a few minutes here at the end to see if there are any questions.
Thank you, Randy. If you do have questions, please just remember to enter those into that chat box on the left. We don't currently have any right now. So I'll also just remind everybody that we will be sending this recording out to you next week. It looks like we don't have anybody chatting in either right now. So thank you everybody for attending. This concludes today's webinar, and you may now disconnect.