Secure virtualised workloads

workload: any job/payload that needs to be executed on infrastructure
- straight on OS
- in a VM
- in a container

Virtual machines

physical hardware
- CPU, memory, chipset, I/O...
- resources often underutilized
- no isolation
hardware-level abstraction
- virtual hardware
- encapsulate all OS and application state
virtualization software
- hypervisor/VMM
- extra level of indirection to decouple hardware and OS
- strong isolation between VMs
- improves utilization
secure multiplexing
- isolation on hardware level
- failure of one VM does not affect others
entire VM is a file
- easy to snapshot, clone, move, distribute
create once, run anywhere (well we try)
types
- type 1: hypervisor runs on bare metal (no host OS) (VMware ESXi, Microsoft Hyper-V, KVM...)
- type 2: hypervisor runs on host OS (VirtualBox, VMware Workstation...)
  - relies on host OS to manage calls to hardware
  - adds latency
  - security risks of host OS exploitable
  - aimed towards developers


Containers

virtualization on OS level
much more lightweight -> more dense utilization
share same host OS / kernel
advantages
- much faster startup
- easier to manage
- more containers per host than VMs
no hardware-level isolation: a kernel exploit can affect every container on the host
the future
- blur the line between containers and VMs
- Kata Containers: lightweight VM per container (better security)
- Microsoft Hyper-V: sometimes wraps containers in a lightweight VM
Linux Security Modules (LSM)
- hostile processes can break out of container (badly configured namespaces, kernel exploits...)
- LSM defines mandatory access control
- lists allowed capabilities (syscalls) per process
- defined by sysadmin
- prevents niche syscalls from being exploited
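
The allowlist idea above can be illustrated with a toy model. This is only a sketch of default-deny checking, not a real LSM interface: the `Profile` class, the `check` function and the "web-server" profile are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-process policy: the syscall names the process may use."""
    name: str
    allowed: frozenset


def check(profile: Profile, syscall: str) -> bool:
    """Default deny: a syscall is permitted only if it is on the allowlist."""
    return syscall in profile.allowed


# Assumed example policy, as a sysadmin might define it for a web server.
web = Profile("web-server", frozenset({"read", "write", "accept", "close"}))

# Niche syscalls such as ptrace or mount are rejected because they are not listed.
decisions = {name: check(web, name) for name in ("read", "ptrace", "mount")}
```

Real mechanisms enforce such a policy inside the kernel, so the confined process cannot bypass the check.
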
types
- OS-level containerization: spawn containers straight on host OS + kernel
  - isolation using kernel functionality (namespaces, cgroups...)
  - no need for full guest OS
  - no hardware extensions
  - attackers could escape container and compromise host
  - Docker
- micro-VM: containers in lightweight VMs on host
  - utilizes hardware-enforced isolation
  - containers do not share kernel
  - safer
  - slower startup, worse performance
- unikernel: application compiled together with tailored kernel
  - monitor application to find which syscalls it uses
  - once known, construct microkernel and fixed-purpose image
  - no user space, only kernel space
  - much smaller attack surface (kernel only contains what's necessary)
  - runs straight on hypervisor or bare metal
  - small footprint, quick to start
- sandboxing: container in sandbox running copy of host kernel
  - syscalls translated to host kernel
  - good isolation
  - slow
  - not all syscalls supported (yet)


Linux kernel isolation support

[https://linuxcontainers.org/]
built into Linux kernel
LXC (Linux Containers)
- OS-level virtualization for running containers on Linux host
- low-level, difficult to use
LXD (Linux Container Hypervisor)
- built on top of LXC
- Canonical development
- focus on containerising entire operating systems, not individual applications

Cgroups

control groups
Linux feature to separate processes into groups
- resource limiting e.g. cpu shares
- prioritization e.g. cpu pinning
- device access
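
A minimal sketch of resource limiting, assuming the cgroup v2 file layout (`cpu.max` holds "quota period" in microseconds, `memory.max` holds a byte count). The helper names `cpu_max` and `write_limits` are made up for this example, and the demo writes into a scratch directory rather than a real cgroup under /sys/fs/cgroup, which would need privileges.

```python
import os
import tempfile


def cpu_max(cpus: float, period_us: int = 100_000) -> str:
    """Format a cgroup v2 cpu.max value: '<quota> <period>' in microseconds."""
    return f"{int(cpus * period_us)} {period_us}"


def write_limits(cgroup_dir: str, cpus: float, memory_bytes: int) -> None:
    """Write CPU and memory limits into the control files of one cgroup directory."""
    with open(os.path.join(cgroup_dir, "cpu.max"), "w") as f:
        f.write(cpu_max(cpus))
    with open(os.path.join(cgroup_dir, "memory.max"), "w") as f:
        f.write(str(memory_bytes))


# Demo against a temporary directory standing in for a cgroup directory.
demo_dir = tempfile.mkdtemp()
write_limits(demo_dir, cpus=0.5, memory_bytes=256 * 1024 * 1024)
with open(os.path.join(demo_dir, "cpu.max")) as f:
    demo_cpu_max = f.read()
```

On a real system the kernel reads these files and throttles every process in the group, here to half a CPU and 256 MiB.
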

Namespaces

provide isolated view of global resources for a group of processes
- only see other processes in the same namespace
- only see allowed devices, users, file system...
- 2 PIDs: global one and one within namespace
- own root file system (copy of host root)
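
The two PIDs can be observed in the `NSpid` line of /proc/<pid>/status (available since Linux 4.1), which lists the process ID in each nested PID namespace, outermost first. A small sketch that parses this line; the function name `parse_nspid` and the sample text are made up for illustration.

```python
def parse_nspid(status_text: str) -> list:
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # line absent, e.g. on kernels older than 4.1


# Assumed sample: a containerised process seen from the host.
sample = "Name:\tsleep\nPid:\t4021\nNSpid:\t4021\t7\n"
pids = parse_nspid(sample)
host_pid, container_pid = pids[0], pids[-1]
```

In the sample, the host sees the process as PID 4021 while inside its namespace it is PID 7.
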

WebAssembly

W3C standard for portable high-performance applications
binary code
- compiled for a virtual CPU (stack-based virtual machine)
- runs in runtime
portable compilation target
near-native performance
WebAssembly System Interface (WASI): OS-level functionality + integrated security
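
To make the binary format concrete: every WebAssembly module starts with the 4-byte magic `\0asm` followed by a little-endian 32-bit version number (currently 1). A short sketch that checks this header; the function name `wasm_version` is made up for this example.

```python
import struct

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary


def wasm_version(data: bytes):
    """Return the module's version number, or None if this is not a wasm binary."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        return None
    return struct.unpack("<I", data[4:8])[0]


# The smallest valid module is just the header: magic + version 1.
empty_module = b"\x00asm\x01\x00\x00\x00"
version = wasm_version(empty_module)
not_wasm = wasm_version(b"\x7fELF\x02\x01\x01\x00")
```

A runtime performs this check, then validates the rest of the module, before executing any code.
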

Trusted execution environment

confidential computing: protect data in use
- at-rest data: data on storage, just encrypt it
- in-transit data: use encryption
- in-use data: needs to be decrypted before it can be used in application
- TEE looks to address data in use security concern
protect guest from untrustworthy host
- confidentiality: unauthorized entities cannot view data used in TEE, data is encrypted in-memory
- integrity: prevent tampering (checksums)
- provable origin: hardware-signed evidence of origin and current state so client can verify and decide to trust code running in TEE
AMD Secure Encrypted Virtualization (SEV, SEV-ES)
Intel Software Guard Extensions (SGX)
Intel Trust Domain Extensions (TDX)
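
The provable-origin idea can be sketched as a toy attestation flow: the hardware measures (hashes) the loaded code and signs the measurement, and the client checks both the signature and the expected hash before trusting the TEE. Loud assumption: real TEEs sign with an asymmetric key backed by a vendor certificate chain. An HMAC with a shared demo key stands in for that here, and all names (`make_evidence`, `verify_evidence`, `HARDWARE_KEY`) are hypothetical.

```python
import hashlib
import hmac

HARDWARE_KEY = b"demo-only-key"  # stand-in for a key fused into the CPU


def make_evidence(code: bytes, key: bytes) -> dict:
    """'Hardware' side: measure the code and sign the measurement."""
    measurement = hashlib.sha256(code).hexdigest()
    signature = hmac.new(key, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "signature": signature}


def verify_evidence(evidence: dict, expected_code: bytes, key: bytes) -> bool:
    """Client side: accept only genuine evidence about the expected code."""
    expected_sig = hmac.new(
        key, evidence["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    genuine = hmac.compare_digest(expected_sig, evidence["signature"])
    unmodified = evidence["measurement"] == hashlib.sha256(expected_code).hexdigest()
    return genuine and unmodified


app = b"def handler(): return 'ok'"
evidence = make_evidence(app, HARDWARE_KEY)
trusted = verify_evidence(evidence, app, HARDWARE_KEY)
tampered = verify_evidence(evidence, app + b" # backdoor", HARDWARE_KEY)
```

The client rejects the evidence when the running code differs from what it expects or when the signature does not come from the hardware key.
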
