# Secure virtualised workloads
* **workload**: any job/payload that needs to be executed on infrastructure
    * straight on OS
    * in a VM
    * in a container
## Virtual machines
![VM architecture](./img/ch09/vm_diagram.png)

* physical hardware
    * CPU, memory, chipset, I/O...
    * resources often underutilized
    * no isolation
* hardware-level abstraction
    * virtual hardware
    * encapsulates all OS and application state
* virtualization software
    * hypervisor/VMM
    * extra level of indirection to decouple hardware and OS
    * strong isolation between VMs
    * improves utilization
* secure multiplexing
    * isolation on hardware level
    * failure of one VM does not affect others
* entire VM is a file
    * easy to snapshot, clone, move, distribute
    * create once, run anywhere (well we try)
* types
    * **type 1**: hypervisor runs on bare metal (no host OS) (VMware ESXi, Microsoft Hyper-V, KVM...)
    * **type 2**: hypervisor runs on host OS (VirtualBox, VMware Workstation...)
        * relies on host OS to manage calls to hardware
        * adds latency
        * vulnerabilities in the host OS can be exploited to reach the VMs
        * aimed towards developers

![Type 1 virtualisation](./img/ch09/type_1_hypervisor.png){width=50%} \ ![Type 2 virtualisation](./img/ch09/type_2_hypervisor.png){width=50%}
## Containers
* virtualization on OS level
    * much more lightweight -> more dense utilization
    * share same host OS / kernel
* advantages
    * much faster startup
    * easier to manage
    * more containers per host than VMs
* downside: no hardware isolation, so security issues
* the future
    * blur the line between containers and VMs
    * **Kata Containers**: lightweight VM per container (better security)
    * **Microsoft Hyper-V**: sometimes wraps containers in lightweight VM
* Linux Security Modules (LSM)
    * hostile processes can break out of container (badly configured namespaces, kernel exploits...)
    * LSM defines mandatory access control
        * policy lists the operations (capabilities, file access...) each process is allowed to perform
        * defined by sysadmin
        * prevents niche syscalls from being exploited
* types
    * **OS-level containerization**: spawn containers straight on host OS + kernel
        * isolation using kernel functionality (namespaces, cgroups...)
        * no need for full guest OS
        * no hardware extensions
        * attackers could escape container and compromise host
        * Docker
    * **micro-VM**: containers in lightweight VMs on host
        * utilizes hardware-enforced isolation
        * containers do not share kernel
        * safer
        * slower startup, worse performance
    * **unikernel**: application compiled together with tailored kernel
        * monitor which syscalls the application uses
        * once known, construct minimal kernel and fixed-purpose image
        * no user space, only kernel space
        * much smaller attack surface (kernel only contains what's necessary)
        * runs straight on hypervisor or bare metal
        * small footprint, quick to start
    * **sandboxing**: container in sandbox running copy of host kernel
        * syscalls translated to host kernel
        * good isolation
        * slow
        * not all syscalls supported (yet)

![Container layout](./img/ch09/container.png){width=50%} \ ![Micro-VM layout](./img/ch09/micro_vm.png){width=50%}
![Unikernel layout](./img/ch09/unikernel.png){width=50%} \ ![Sandbox layout](./img/ch09/sandbox.png){width=50%}
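The mandatory access control idea from the LSM bullets can be sketched as a default-deny lookup. This is a toy model only, not a real LSM interface: the profile names, the operation strings and `is_allowed` are all hypothetical. Real LSMs such as AppArmor or SELinux enforce their policy inside the kernel.

```python
# Toy model of mandatory access control (MAC); not a real LSM interface.
# Profile names and operation strings below are hypothetical.

# Policy written by the sysadmin: profile -> operations it may perform.
POLICY = {
    "webserver": {"read:/var/www", "bind:tcp/443"},
    "backup": {"read:/var/www", "write:/mnt/backup"},
}


def is_allowed(profile: str, operation: str) -> bool:
    """Default deny: only operations listed for the profile are permitted."""
    return operation in POLICY.get(profile, set())


if __name__ == "__main__":
    print(is_allowed("webserver", "bind:tcp/443"))       # listed in the policy
    print(is_allowed("webserver", "write:/etc/shadow"))  # not listed
```

The process itself cannot change `POLICY`, which is what makes the control mandatory rather than discretionary.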
## Linux kernel isolation support
* <https://linuxcontainers.org/>
* built into Linux kernel
* LXC (Linux Containers)
    * OS-level virtualization for running containers on Linux host
    * low-level, difficult to use
* LXD (Linux Container Hypervisor)
    * built on top of LXC
    * developed by Canonical
    * focus on containerising entire operating systems, not individual applications
### Cgroups
* control groups
* Linux feature to separate processes into groups
    * resource limiting e.g. CPU shares
    * prioritization e.g. CPU pinning
    * device access
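As an illustration of resource limiting, the sketch below interprets a CPU limit as the kernel exposes it. It assumes cgroup v2, where a group's `cpu.max` file holds `<quota> <period>` in microseconds, or `max <period>` when no limit is set. The function name `cpu_limit` is made up for this example.

```python
from typing import Optional


def cpu_limit(cpu_max: str) -> Optional[float]:
    """Return the CPU limit in cores from the contents of a cgroup v2 cpu.max file.

    Returns None when the quota is "max", meaning no limit is set.
    """
    quota, period = cpu_max.split()
    if quota == "max":
        return None  # unlimited
    return int(quota) / int(period)


if __name__ == "__main__":
    print(cpu_limit("50000 100000"))  # quota is half of the period
    print(cpu_limit("max 100000"))    # no limit configured
```

On a Linux host with cgroup v2, the input string would typically be read from a file such as `/sys/fs/cgroup/<group>/cpu.max`.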
### Namespaces
* provide isolated view of global resources for a group of processes
    * only see other processes in the same namespace
    * only see allowed devices, users, file system...
* a process has 2 PIDs: a global one and one within its namespace
* own root file system (copy of host root)
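A small sketch of how namespace membership can be inspected on Linux: each entry in `/proc/self/ns` is a symbolic link whose target looks like `pid:[4026531836]`, and two processes are in the same namespace when the numbers match. The helper names are hypothetical, and the code returns an empty result where `/proc/self/ns` does not exist.

```python
import os
import re
from typing import Dict, Tuple

NS_DIR = "/proc/self/ns"  # Linux-specific; absent on other systems


def parse_ns_link(target: str) -> Tuple[str, int]:
    """Split a link target such as 'pid:[4026531836]' into (type, id)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def current_namespaces() -> Dict[str, int]:
    """Map each entry in /proc/self/ns to its namespace id (empty if unavailable)."""
    result: Dict[str, int] = {}
    if not os.path.isdir(NS_DIR):
        return result
    for name in os.listdir(NS_DIR):
        try:
            _, ns_id = parse_ns_link(os.readlink(os.path.join(NS_DIR, name)))
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        result[name] = ns_id
    return result


if __name__ == "__main__":
    for name, ns_id in sorted(current_namespaces().items()):
        print(name, ns_id)
```

Running this once on the host and once inside a container would normally show different ids for the namespaces the container runtime has unshared.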
## WebAssembly
* W3C standard for portable high-performance applications
* binary code
    * compiled for a virtual CPU
    * executed by a runtime
* portable compilation target
* near-native performance
* WebAssembly System Interface (WASI): OS-level functionality + integrated security
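To make "binary code" concrete: every WebAssembly binary module starts with the four magic bytes `\0asm` followed by a little-endian 32-bit version number (currently 1). The sketch below only checks that header; `wasm_version` and `EMPTY_MODULE` are names chosen for this example, and no sections or instructions are parsed.

```python
import struct

WASM_MAGIC = b"\x00asm"  # first four bytes of every binary module


def wasm_version(data: bytes):
    """Return the binary format version, or None if this is not a module."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        return None
    (version,) = struct.unpack("<I", data[4:8])  # little-endian uint32
    return version


# Smallest valid module: just the 8-byte header, no sections.
EMPTY_MODULE = b"\x00asm\x01\x00\x00\x00"

if __name__ == "__main__":
    print(wasm_version(EMPTY_MODULE))
    print(wasm_version(b"\x7fELF\x02\x01\x01\x00"))  # a native ELF header instead
```

The same bytes are valid on any host, which is what makes the format a portable compilation target. The runtime translates them for the actual CPU.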
## Trusted execution environment
* confidential computing: protect data in use
    * at-rest data: data on storage, just encrypt it
    * in-transit data: use encryption
    * in-use data: needs to be decrypted before it can be used in application
* TEE looks to address the data-in-use security concern
    * protect *guest* from untrustworthy *host*
    * confidentiality: unauthorized entities cannot view data used in TEE, data is encrypted in-memory
    * integrity: prevent tampering (checksums)
    * provable origin: hardware-signed evidence of origin and current state so client can verify and decide to trust code running in TEE
* AMD Secure Encrypted Virtualization (SEV, SEV-ES)
* Intel Software Guard Extensions (SGX)
* Intel Trust Domain Extensions (TDX)

![Container architecture](./img/ch09/container_diagram.png)
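The "provable origin" property of a TEE can be sketched as measure, sign, verify. This is a simplified illustration and not how SEV, SGX or TDX are actually programmed. Real TEEs sign with an asymmetric key backed by a vendor certificate chain, whereas this sketch uses an HMAC with a shared secret as a stand-in. `HARDWARE_KEY` and all function names are hypothetical.

```python
import hashlib
import hmac

# Stand-in for a key fused into the hardware. Real TEEs use asymmetric keys
# with a vendor certificate chain; a shared secret only keeps the sketch short.
HARDWARE_KEY = b"hypothetical-hardware-key"


def measure(code: bytes) -> bytes:
    """Hash of the loaded code: the 'current state' being attested."""
    return hashlib.sha256(code).digest()


def attest(code: bytes) -> tuple:
    """TEE side: return (measurement, signature over the measurement)."""
    measurement = measure(code)
    signature = hmac.new(HARDWARE_KEY, measurement, hashlib.sha256).digest()
    return measurement, signature


def verify(measurement: bytes, signature: bytes, expected_code: bytes) -> bool:
    """Client side: check the signature, then compare with the expected code."""
    expected_sig = hmac.new(HARDWARE_KEY, measurement, hashlib.sha256).digest()
    return (hmac.compare_digest(signature, expected_sig)
            and hmac.compare_digest(measurement, measure(expected_code)))


if __name__ == "__main__":
    m, s = attest(b"app v1")
    print(verify(m, s, b"app v1"))  # evidence matches the code the client expects
    print(verify(m, s, b"app v2"))  # different code was expected
```

The client only sends secrets to the guest after `verify` succeeds, so an untrustworthy host cannot swap in different code unnoticed.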