Docker Concise Tutorial

Docker - Data Storage

By design, data should generally not be persisted directly in a Docker container, for a few reasons. First, containers are intended to be transient: they can be stopped, started, or destroyed at any time. Data stored inside a container is therefore lost whenever that container ceases to exist, which makes persisting and recovering data hard.

Second, the writable layer of a container is tightly coupled to the host machine on which it runs, which often makes it hard to move the data to another machine or to extract it. Furthermore, writes to this layer usually go through a storage driver and a union file system, which can add performance overhead compared with writing directly to the host’s file system.

Third, storing data within a container leads to problems with scaling and sharing: more than one container may need to access the same data, which complicates managing it and keeping it in sync. That is why it is much better to use Docker volumes or bind mounts to store data outside the container, which provides persistence, portability, and easy access.

In this chapter, let’s discuss how volumes and bind mounts can be used to persist data in Docker containers.

Different Ways to Persist Data in Docker Containers

Whether you use a volume, a bind mount, or a tmpfs mount, the data appears inside the container as a directory or file within the container’s filesystem. The crucial difference is where the data resides on the Docker host.

Volumes live in a Docker-managed part of the host filesystem, usually /var/lib/docker/volumes/ on Linux. Non-Docker processes should not modify this area. Volumes are the preferred mechanism for persisting data in Docker.

Bind mounts, on the other hand, can be located anywhere on the host system, including important system files or directories, and can therefore be changed by processes not managed by Docker. This makes them more flexible but less isolated. Finally, tmpfs mounts exist only in the host system’s memory and are never written to the host’s filesystem, which makes them well suited to ephemeral, non-persistent data.

The -v or --volume flag specifies a mount point for a volume or bind mount, and the --tmpfs flag does the same for a tmpfs mount; their syntaxes differ slightly. For readability and clarity, prefer --mount where possible: it takes all options as comma-separated key=value pairs in a single argument.
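As a minimal sketch of the flag styles described above, the shell functions below wrap equivalent docker run invocations. The names my-vol and my-image are placeholders, and /app/cache is a hypothetical path chosen for illustration; the functions only demonstrate the syntax.

```shell
# Placeholders: my-vol (volume), my-image (image), /app/cache (hypothetical path).

# Short syntax: mount the named volume my-vol at /app/data with -v.
mount_with_v() {
  docker run -d -v my-vol:/app/data my-image
}

# Same mount written with --mount as comma-separated key=value pairs.
mount_with_mount() {
  docker run -d --mount type=volume,source=my-vol,target=/app/data my-image
}

# In-memory tmpfs mount at /app/cache using the dedicated --tmpfs flag.
mount_with_tmpfs() {
  docker run -d --tmpfs /app/cache my-image
}
```

The first two functions request the same volume mount; the --mount form is longer but names every field explicitly.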

Docker Volumes

Volumes are the preferred way to persist data generated and used by Docker containers. Docker manages them, and they are independent of the host machine’s directory structure. They also offer several benefits over other storage strategies such as bind mounts.

Key Features of Docker Volumes

  1. Persistence − Data stored in volumes outlives containers that are stopped, removed, or replaced.

  2. Portability − Volumes are easy to back up, migrate, or share among multiple containers.

  3. Management − You can control and manage Docker volumes with Docker CLI commands or via the Docker API.

  4. Cross-platform compatibility − Volumes work with both Linux and Windows containers.

  5. Performance − On Docker Desktop, volumes perform better than bind mounts from Mac and Windows hosts.

Creating a Volume

This is the basic command to create a new volume named "my-vol".

$ docker volume create my-vol

Attach a Volume to a Container

The command below attaches the "my-vol" volume to the "/app/data" directory within the container. Any data written to this directory is stored persistently in the volume.

$ docker run -d --name my-container -v my-vol:/app/data my-image
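To illustrate that data in a volume outlives the container that wrote it, the sketch below writes a file from one short-lived container and reads it back from a second one. The volume name demo-vol and the file name greeting.txt are hypothetical, and the sketch assumes the public alpine image can be pulled.

```shell
# Hypothetical names: demo-vol (volume), greeting.txt (file).
demo_volume_persistence() {
  # Create a fresh named volume.
  docker volume create demo-vol
  # First container writes a file into the volume, then is removed (--rm).
  docker run --rm -v demo-vol:/app/data alpine sh -c 'echo hello > /app/data/greeting.txt'
  # Second container mounts the same volume and reads the file back.
  docker run --rm -v demo-vol:/app/data alpine cat /app/data/greeting.txt
  # Clean up the demo volume (this deletes its data).
  docker volume rm demo-vol
}
```

Both containers are gone by the time the function returns; only the volume carried the file from the first to the second.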

Listing Volumes

This command lists all the volumes available in your Docker environment.

$ docker volume ls

Inspecting a Volume

This command gives detailed information about the volume, including its mount point and driver.

$ docker volume inspect my-vol
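When only one field of the inspect output is needed, the --format option with a Go template can extract it. The helper below is a small sketch; the function name volume_mountpoint is made up for illustration.

```shell
# Print only the host mount point of the volume named by the first argument.
# volume_mountpoint is a hypothetical helper name.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Example: volume_mountpoint my-vol
```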

Removing a Volume

This command removes the "my-vol" volume. Warning: the data in the volume is destroyed irreversibly.

$ docker volume rm my-vol
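Docker refuses to remove a volume that is still referenced by a container. The sketch below checks first and reports which containers use the volume, relying on the volume filter of docker ps. The function name remove_volume_if_unused is hypothetical.

```shell
# Remove the volume named by $1 only if no container (running or stopped) uses it.
# remove_volume_if_unused is a hypothetical helper name.
remove_volume_if_unused() {
  # -a includes stopped containers, -q prints only container IDs.
  if [ -n "$(docker ps -a -q --filter "volume=$1")" ]; then
    echo "volume $1 is still used by: $(docker ps -a -q --filter "volume=$1")" >&2
    return 1
  fi
  docker volume rm "$1"
}

# Example: remove_volume_if_unused my-vol
```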

Real-World Use Cases of Docker Volumes

  1. Databases − Store database files in a volume so that they persist across container restarts and replacements.

  2. Web Server Content − Store website files or user uploads in a volume so that they remain accessible even when the web server container is replaced.

  3. Application Logs − Store logs in a volume for easy analysis and persistence.

Docker volumes provide robust, flexible management of persistent data for containerized applications. With volumes, data remains safe and accessible even in dynamic container environments.
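One common way to back up a volume is to mount it read-only in a temporary container and archive its contents into a host directory. The following is a sketch under assumptions: the helper name backup_volume is made up, the destination must be an absolute path to an existing host directory, and the public alpine image is used only because it ships a tar command.

```shell
# Archive the contents of volume $1 into $2/$1.tar.gz on the host.
# $2 must be an absolute path to an existing host directory.
# backup_volume is a hypothetical helper name.
backup_volume() {
  docker run --rm \
    -v "$1":/data:ro \
    -v "$2":/backup \
    alpine tar czf "/backup/$1.tar.gz" -C /data .
}

# Example: backup_volume my-vol /tmp/backups
```

Mounting the volume with :ro keeps the backup container from modifying the data it is archiving.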

Bind Mounts

A bind mount in Docker shares a file or directory from the host machine directly with a container by mapping the host path to a path in the container. Unlike volumes, bind mounts are not managed by Docker: they depend on the host’s directory structure, and you are responsible for managing the host files yourself.

Key Features of Bind Mounts

  1. Direct Access − Any changes made to the files on the host are immediately reflected within the container, and vice versa.

  2. Flexibility − You can mount any location on your host system, including system files, configuration files, or your project’s source code.

  3. Development Workflow − Bind mounts are a boon during development: you can edit code on your host, and the changes are visible in the running container almost immediately.

Mount Host Directory

The command below mounts the current directory on your machine at the container’s '/app' directory. Any changes to files in the current directory are reflected inside the container, and vice versa.

$ docker run -d --name my-container -v "$(pwd)":/app my-image

Mount a Single File

This mounts the host file "file.txt" at the path "/etc/config.txt" in the container.

$ docker run -d --name my-container -v /path/to/file.txt:/etc/config.txt my-image

Using the --mount Flag

The --mount flag specifies a bind mount more explicitly, stating its type, source, and target.

$ docker run -d --name my-container \
   --mount type=bind,source="$(pwd)",target=/app my-image
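Because changes made in the container affect host files directly, it can be useful to make a bind mount read-only when the container only needs to read. The sketch below adds the readonly option to the --mount syntax; the function name run_readonly_bind is hypothetical, and my-container and my-image are the same placeholders used above.

```shell
# Bind-mount the current host directory at /app, read-only inside the container.
# run_readonly_bind is a hypothetical helper name.
run_readonly_bind() {
  docker run -d --name my-container \
    --mount type=bind,source="$(pwd)",target=/app,readonly \
    my-image
}
```

With this option, writes to /app from inside the container fail, while edits made on the host remain visible in the container.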

Real-Life Applications of Bind Mounts in Docker

  1. Dev Environments − Mount directories containing source code so that changes to the source are picked up live.

  2. Configuration Files − Mount your host’s configuration files into the container to customize its behavior.

  3. Share Host Resources − Mount files or directories that the container needs to access, such as log files and data files.

Named Pipes and TMPFS

Docker also offers two non-persistent mount types: tmpfs mounts on Linux, which keep data in the host’s memory, and named pipes on Windows, which are used for communication between the host and a container.

tmpfs Mounts (Linux)

When using Docker on Linux, a tmpfs mount creates a temporary filesystem held in memory. Files written to a tmpfs mount are not persisted to disk, which makes it ideal for sensitive information or temporary data that does not need to outlive the container.

Because tmpfs operates from memory, reads and writes are much faster than with disk-based storage. However, the data in tmpfs is volatile and is lost when the host system reboots or the container stops.
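Since a tmpfs mount consumes host memory, it is often sensible to cap its size. The sketch below uses the --mount syntax with the tmpfs-size option, given in bytes. The path /app/cache and the 64 MiB limit are arbitrary choices for illustration, and run_with_tmpfs_cache is a hypothetical helper name.

```shell
# Start a container with an in-memory tmpfs mount at /app/cache,
# limited to 64 MiB (64 * 1024 * 1024 = 67108864 bytes).
# run_with_tmpfs_cache is a hypothetical helper name.
run_with_tmpfs_cache() {
  docker run -d --name my-container \
    --mount type=tmpfs,destination=/app/cache,tmpfs-size=67108864 \
    my-image
}
```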

Named Pipes (Windows)

On Windows, Docker supports named pipe (npipe) mounts. Unlike tmpfs mounts, named pipes are not a storage mechanism: they are a communication channel that lets a process in a container exchange data with the Docker host. A common use is running a tool inside a container that connects to the Docker Engine API through a named pipe.

Like data in tmpfs, data passing through a named pipe is not written to disk and does not persist after the container stops. Named pipes are one of the basic mechanisms of inter-process communication in Windows, and Docker builds on them for host-to-container communication rather than for in-memory storage.

Neither tmpfs mounts nor named pipes persist data. tmpfs mounts suit temporary files, caches, or sensitive information that should not be written to disk, where performance matters more than persistence; named pipes suit communication between a container and the Windows host.

When to Use Docker Volumes and Bind Mounts?

Volumes are the best way to handle persistent storage in Docker. They are well suited to sharing data between containers, to cases where you cannot guarantee the host’s file structure, to storing data remotely, and to situations in which you have to back up, restore, or migrate data. In addition, volumes perform better and provide native file system behavior for I/O-intensive applications on Docker Desktop.

Bind mounts, in contrast, link files or directories from the host directly to a path in the container. They are often helpful for sharing a configuration file or source code between the host and containers, especially in development environments. Be cautious when using bind mounts with sensitive data, because changes made in the container directly affect the host.

Tmpfs mounts, being entirely memory-based and temporary, fit non-persistent data such as caches or sensitive information. They prioritize speed and security over data persistence.

Conclusion

So there you have it. You have learned about Docker’s storage options (volumes, bind mounts, and tmpfs mounts) for managing data in containerized applications. Knowing their differences lets you make an informed choice of where and how to store your data.

Volumes give you persistence, portability, and isolation, making them suitable for valuable data that needs to live longer than individual containers. Bind mounts are more flexible and offer real-time access to host files; they are helpful for development and for sharing particular resources. Tmpfs mounts prioritize speed and security; they provide temporary in-memory storage for sensitive or transient data.

The right storage mechanism depends on your specific requirements and use case. By considering factors such as data persistence, access patterns, and performance needs, you can use Docker’s storage options to build efficient, reliable, and secure containerized applications.

FAQ About Docker Data Storage

Q 1. What happens to my data when a Docker container stops or is removed?

Data written directly to a container’s writable layer survives a stop and restart of that container, but it is lost when the container is removed. Because containers are meant to be ephemeral and are routinely removed and replaced, such data should be treated as temporary.

To make data persistent, use storage mechanisms such as Docker volumes and bind mounts, which store the data outside the container, either in Docker-managed storage or elsewhere on the host filesystem, and link it into the container.

Q 2. What is the difference between Volumes and Bind Mounts in Docker?

Volumes are Docker’s preferred way to handle persistent storage. They are managed by Docker itself and exist independently of any single container, which gives them useful properties such as portability, easy backup, and, on some platforms, better performance.

Bind mounts link a file or directory on the host directly into the container. They are useful for sharing host files with a container, but compared with volumes they are less isolated and less portable, because they depend on the host’s directory layout and let the container modify host files.

Q 3. What is a tmpfs mount in Docker, and when should it be used?

Tmpfs mounts are temporary filesystems that reside solely in the host system’s memory. They are not persistent, which makes them ideal for ensuring that sensitive data or temporary files do not outlive the container.

Although tmpfs mounts offer fast reads and writes, they are volatile: the data is lost when the host reboots or the container is stopped.

Q 4. Can I use AWS S3 or Azure Blob storage solutions with Docker?

You can still use cloud storage solutions such as AWS S3 or Azure Blob Storage with Docker. Typically, the application in the container talks to the cloud provider’s SDK or API rather than mounting the cloud storage directly as a volume. This lets the application store and retrieve data in the cloud, which is durable and scales almost without limit.

Q 5. How can I secure the data stored in Docker volumes?

Measures for securing data in Docker volumes include: restricting access to volumes with proper permissions and avoiding indiscriminate sharing of volumes between containers; backing up volumes regularly to external storage for disaster recovery; encrypting volume contents that hold sensitive data, or encrypting the host’s whole file system; and keeping up to date with Docker security updates and best practices to protect your environment from vulnerabilities.