Virtual Machines vs. Containers: What’s the Difference?
Learn the difference between virtual machines and containers, and how businesses use them.
Virtual machines (VMs) and containers are both virtual environments that use software instead of a physical machine to isolate applications into a self-contained unit. But VMs and containers are built and run differently, which affects their capabilities and the costs of using them.
To understand VMs, let’s start with a basic understanding of servers. They’re called servers because they serve up applications, information, or other services to other computers. VMs are actually virtual servers in that they serve up an application to other computers. VMs are deployed on a physical computer or server, called the host. You could run a VM on your personal computer, but companies, which run dozens or hundreds of VMs, typically house their VMs on a physical server.
Before companies hosted their applications in the cloud, they housed them on physical servers stored in racks in the company’s data center. Unlike commercial applications such as Word and Excel, company applications are built by a company so that employees and customers can do things like download documents or order items or services. Each company application would sit on its own physical server.
Each time IT planned a new application, the team had to decide how big a server to buy to host it. Sizing the server was always a conundrum because IT could never be sure how much computation (CPU), memory, and storage the new application would need. If IT ordered a server with far more resources than the application needed, it could get scolded for overspending. If IT instead bought a server that fit the application’s needs for the first few months and the application turned out to be more popular than expected, the server would eventually run short of resources, performance would suffer, and IT could get scolded for not planning for growth.
When an application on a physical server lacks the resources to handle increased traffic, the application slows down or, worse, crashes. If you’ve ever tried to complete a form on a website and sat there waiting for the app to let you proceed from one question to the next, you know how frustrating it is to interact with an app that lacks the resources to work well. Those problems became much less common once companies began using VMs, because a VM can be given more resources, up to what its host has available, far more easily than a physical server can be upgraded.
As stated above, a virtual machine is a virtual computer that serves information to a user. To better understand what a VM is, let’s start with something we know: a Word document. Although a VM is more complex than a Word document, the two are alike in that both are virtual, meaning you can’t touch them, and both contain information. A VM contains images and code just as a Word document may contain images and text, which is itself a kind of code, or language. The text in a Word document could be in English, Spanish, or some other language, and the code in the VM could be in Java, C#, or another language.
To create a Word document, you need special software like Microsoft Word, WordPerfect, or Google Docs. Likewise, to create a VM you need software like VMware, VirtualBox, or QEMU.
Using a physical computer and software, a user can create a VM. The VM can reside on that physical computer, but VMs take up a lot of resources, so users at organizations typically create a VM on their own computer and save it on a VM server, which is a physical server. These days such machines are usually just called servers; the name VM server comes from their being purpose-built to house virtual machines. VM servers can host dozens of VMs and can be located in a company’s data center, in a data center the company rents, or in a data center belonging to a cloud service provider like Amazon Web Services (AWS) or Microsoft Azure.
Just as you can create many copies of the same Word document and save them in different folders on your computer, on other computers, on storage devices, or in a cloud service like Dropbox, you can create many instances of a VM containing an application and store them on various computers, servers, storage devices, and in the cloud. The benefit of having multiple instances is that if one gets too much traffic at any point, you can quickly spin up another instance of the VM to handle the extra load. And if one VM breaks, another can come online automatically and take over, so users are unaware of the trouble behind the scenes.
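The scale-out and failover idea can be sketched in a few lines of Python. This is only an illustration: the function names, the instance names, and the notion of a fixed requests-per-second capacity for each instance are assumptions made for the example, not part of any real VM platform.

```python
import math


def instances_needed(requests_per_second: float, capacity_per_instance: float) -> int:
    """Return how many identical VM instances are needed to serve the load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Keep at least one instance running, even when there is no traffic.
    return max(1, math.ceil(requests_per_second / capacity_per_instance))


def replace_failed(instances: dict[str, bool]) -> dict[str, bool]:
    """Return a new fleet in which every unhealthy instance is swapped for a fresh one."""
    fleet = {name: True for name, healthy in instances.items() if healthy}
    for name, healthy in instances.items():
        if not healthy:
            # Hypothetical naming scheme for the replacement instance.
            fleet[f"{name}-replacement"] = True
    return fleet
```

With an assumed capacity of 100 requests per second per instance, a load of 250 requests per second calls for three instances, and a fleet with one broken VM comes back with a replacement in its place.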
A computer or server that hosts a VM is known as the host, while the VMs that reside on it are known as guests. There is a layer of software in between the host OS and the VM called the hypervisor, which creates and runs VMs on a server (See fig. 1). The hypervisor allows one host to support multiple guest VMs by sharing the server’s resources—memory, processing and storage.
Fig. 1. VM uses its host OS as well as its own OS (Source)
A system administrator allocates a portion of the server’s resources to each VM running on it, depending on the application’s needs. Those resources include memory, storage, and CPU, the processor that executes instructions from your devices and software, as well as virtual versions of devices such as a graphics card, sound card, network adapter, hard drives, keyboard, and mouse.
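As a rough sketch of that allocation step, the Python below models a host that hands out CPU cores and memory to guests and refuses a request it cannot satisfy. The `Host` class and its methods are hypothetical names for this example, and the model is deliberately simplified: real hypervisors can also overcommit resources, which is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical server whose CPU cores and memory are shared among guest VMs."""
    cpu_cores: int
    memory_gb: int
    # Maps guest name -> (cpu cores, memory in GB) allocated to that guest.
    guests: dict[str, tuple[int, int]] = field(default_factory=dict)

    def free(self) -> tuple[int, int]:
        """Return the (cpu cores, memory in GB) not yet allocated to any guest."""
        used_cpu = sum(cpu for cpu, _ in self.guests.values())
        used_mem = sum(mem for _, mem in self.guests.values())
        return self.cpu_cores - used_cpu, self.memory_gb - used_mem

    def allocate(self, name: str, cpu: int, mem: int) -> bool:
        """Reserve resources for a new guest; return False if the host lacks capacity."""
        if name in self.guests:
            raise ValueError(f"guest {name!r} already exists")
        free_cpu, free_mem = self.free()
        if cpu > free_cpu or mem > free_mem:
            return False
        self.guests[name] = (cpu, mem)
        return True
```

On a host with 16 cores and 64 GB of memory, a 4-core guest and an 8-core guest both fit, but a further request for 8 cores is turned down because only 4 remain.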
In addition to relying on the server’s operating system (OS), each VM has its own OS, which could be Windows, macOS, a Linux distribution such as Ubuntu, or something else. The VM’s OS can be entirely different from that of the server; as long as the VM has the system components an application needs, it can run any OS you want. That guest OS includes drivers for devices such as audio cards, which will probably never be used. Running two operating systems makes VMs a heavyweight solution. The extra layers are not only redundant, they carry costs: licensing fees for the operating systems, money to run all those resources, and staff time spent waiting for them to start up. If you don’t like waiting for your computer or mobile phone to start up, you can imagine what it’s like for a developer or systems administrator to sit and wait for their VMs to start.
VMs are typically thought of as being secure because each VM is isolated from the others and does not communicate with them by default. So if one gets attacked, the others likely won’t be affected. However, security policies must still be put in place, because services running on a VM’s OS could be used to inject malicious code into the host system. Security features on both the host and guest systems help prevent that.
Traditional applications, often referred to as monolithic applications, are written as one large, self-contained block of code that is organized into different categories. Each category in a monolithic app is still closely tied to the other categories and to the other services within its own category (see fig. 2).
Fig. 2. Monolithic application that connects to a relational database management system (Source)
In the above diagram of an e-commerce application, each of the categories highlighted in blue may contain a variety of services, or functions. For example, the shopping cart service may have one function that puts items in a cart, another that keeps track of loyalty points, another that recommends other products, and dozens more. In monolithic applications, the various services are all written in the same language and are tightly coupled, which means they are highly interdependent. If one of those components falters, there could be ripple effects throughout the application, causing functions in other categories to fail or, worse, the entire application to crash.
Monolithic applications that used to be stored on physical servers in a company’s data center are now being moved to VMs. Moving these apps off hardware servers and onto VMs not only frees up physical space, it also lets you create numerous instances of a VM. If a monolithic app on a physical server were to fail, the entire app would go offline until you could fix it. As noted earlier, you can store many instances of a VM anywhere: on a computer, on a server, or in the cloud. If a VM with an app on it were to go offline, you would simply spin up another instance of that VM. That is one reason why companies are moving applications to the cloud.
Companies today typically use a monolithic architecture only for small applications. For example, an application that lets employees update their contact information and add two emergency contacts is small, so it would make sense to build it as a monolith in a VM. But an application that an e-commerce company builds to sell its goods might include a wide variety of services. At the largest retailers, the catalog of goods alone consists of thousands of items. The shopping cart service alone may include suggestions for other items, discount codes, and loyalty points. Other parts of the application might include gift guides, subscriptions, deals of the day, gift cards, ordering, order history, and much more. If all of that were in one monolithic application, a single fault could cause problems in other parts of the application.
Instead of creating large applications in one block of code where components are tightly coupled, companies are creating cloud native applications that use a microservice architecture, which breaks large application structures into smaller, independent services that are not tied to a specific programming language. So instead of building one monolithic application with 100 services, developers create each of those services as its own independent application, using the programming language they are most comfortable coding in. These small independent applications are typically created in containers or on a cloud provider’s serverless platform, such as AWS Lambda. They could also be created in VMs, but because VMs consume so many resources, that would be tantamount to one person driving an 18-wheeler 300 miles to deliver one tiny package.
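The difference in how a failure spreads can be sketched with a toy shopping cart in Python. Everything here is hypothetical and simplified: the function names are invented for the example, and ordinary function calls stand in for what would be network calls between separately deployed services.

```python
def add_to_cart(cart: list[str], item: str) -> list[str]:
    """Return a new cart with the item added."""
    return cart + [item]


def recommend(cart: list[str]) -> list[str]:
    """Simulate a recommendation component that is currently broken."""
    raise RuntimeError("recommendation engine is down")


def monolith_checkout(cart: list[str], item: str) -> dict:
    # Tightly coupled: every step shares one code path, so one failure aborts them all.
    cart = add_to_cart(cart, item)
    suggestions = recommend(cart)
    return {"cart": cart, "suggestions": suggestions}


def call_service(service, *args, fallback):
    """Call an independent service and return the fallback value if it fails."""
    try:
        return service(*args)
    except Exception:
        return fallback


def microservice_checkout(cart: list[str], item: str) -> dict:
    # Loosely coupled: a failing service degrades one feature, not the whole request.
    cart = call_service(add_to_cart, cart, item, fallback=cart)
    suggestions = call_service(recommend, cart, fallback=[])
    return {"cart": cart, "suggestions": suggestions}
```

Calling `monolith_checkout([], "book")` raises the `RuntimeError` and the shopper gets nothing, while `microservice_checkout([], "book")` still returns the cart with the book in it and simply shows no suggestions.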
Containers are packages of software that include all the elements needed to run an application in any environment, whether on-premises or in the cloud. Created to do a single service, or function, and do it well, a container comprises only the application and the binaries and libraries it requires. It is started from an image, a file that bundles that executable code so it can run as an isolated process. This small footprint allows you to host far more containers than VMs on a server.
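A back-of-the-envelope calculation shows why footprint matters. The figures in this Python sketch are made-up assumptions chosen only to illustrate the arithmetic (a 64 GB server, a 0.5 GB application, and 1.5 GB of memory for each guest OS); they are not measurements, and real overheads vary widely.

```python
def max_instances(server_memory_gb: float, app_memory_gb: float, os_overhead_gb: float) -> int:
    """Return how many instances fit in memory when each carries the given OS overhead."""
    per_instance = app_memory_gb + os_overhead_gb
    if per_instance <= 0:
        raise ValueError("each instance must use some memory")
    return int(server_memory_gb // per_instance)


# Illustrative assumptions only, not measured values.
SERVER_GB = 64.0
APP_GB = 0.5

# Each VM carries its own guest OS on top of the application.
vms = max_instances(SERVER_GB, APP_GB, os_overhead_gb=1.5)

# Containers share the host kernel, so no per-instance OS is counted here.
containers = max_instances(SERVER_GB, APP_GB, os_overhead_gb=0.0)

print(vms, containers)
```

Under these assumptions the same server holds 32 VMs or 128 containers. The exact ratio depends entirely on the numbers you plug in.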
In contrast to a VM, which includes its own OS, a container doesn’t carry a full OS of its own. The guest OS makes VMs heavy and slow to start, with start-up sometimes taking minutes, whereas a container can typically start in seconds or less. But because VMs have their own OS, it doesn’t matter what OS the server is using. Containers, on the other hand, must be built to run on the OS of the server that will host them. So if a container is going to run on Linux, it must be built to be compatible with Linux.
You can group many containers together in clusters to deploy a larger application. The clusters can be managed by container orchestrators like Kubernetes, which let you manage an entire cluster as a single unit.
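Orchestrators are commonly described as working from a desired state: you declare how many copies of a container should be running, and the orchestrator starts or stops containers until reality matches. The Python below is a toy model of a single pass of that idea. It does not use the Kubernetes API, and the `reconcile` function and the `web-N` naming scheme are assumptions made for this sketch.

```python
def reconcile(desired: int, running: list[str]) -> list[str]:
    """Return the list of running containers after one reconciliation pass."""
    if desired < 0:
        raise ValueError("desired must not be negative")
    running = list(running)  # Work on a copy so the caller's list is untouched.
    next_id = 0
    # Scale up: start containers under unused names until the count is met.
    while len(running) < desired:
        name = f"web-{next_id}"
        if name not in running:
            running.append(name)
        next_id += 1
    # Scale down: keep only the first `desired` containers.
    return running[:desired]
```

For example, `reconcile(3, ["web-0"])` returns `["web-0", "web-1", "web-2"]`, and `reconcile(1, ["web-0", "web-1"])` returns `["web-0"]`.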
Instead of running on top of a hypervisor as VMs do, containers run on a container engine, such as Docker or Podman (see fig. 3). On the host system, containers share a kernel, the core of the OS that, among other things, schedules programs to run. The container engine exposes parts of the host operating system to a partitioned area where the containers live, which makes them quick to start up.
Whereas each VM has its own OS and is isolated from the other VMs on a server, containers share the host OS kernel, and that can be a security concern. To mitigate the risk, network policies should be implemented and compliance requirements should be considered before production begins.
Fig. 3. Docker container sitting on host OS (Source)
Whether you use VMs in your private cloud, a public cloud or on hardware on premises, you’ll need to know the best ways to manage them and to use various technologies that help with that, such as Red Hat Virtualization suite, VMware, and XenApp.
VMs are a great way to move legacy and traditional applications to the cloud and to host small applications. Containers work well for building large applications, adopting a cloud native architecture, and letting developers work in the programming language of their choice.