
Java files, part 6: FileChannel, ByteBuffer, Memory-mapped I/O, Locks

February 26, 2020

The previous five parts of this article series covered reading and writing files, directory and file path construction, directory and file operations, and writing and reading structured data.

In today’s part, I explain the NIO classes FileChannel and ByteBuffer, introduced in Java 1.4 with JSR 51 (“New I/O APIs for the Java™ Platform”). I also show how to read and write files with them and what advantages they offer compared to the methods discussed before.

In detail, you’ll learn:

  • What are FileChannels and ByteBuffers, and what are their advantages?
  • How to write and read files with FileChannel and ByteBuffer?
  • What are memory-mapped files, and what are their advantages?
  • How to lock specific sections of a file?
  • Which write method has the best performance?

You can find the code from this article in my GitLab Repository.

Terminology

What is a FileChannel?

A Channel is a communication link to a file, socket, or another component that provides I/O functionality. Unlike InputStream or OutputStream, a Channel is bidirectional, which means you can use it for both writing and reading.

A FileChannel is a Channel for connecting to a file.

What is a ByteBuffer?

A ByteBuffer is basically a byte array (on the Java heap or in native memory), combined with write and read methods. This encapsulation allows writing to or reading from the ByteBuffer without having to know the position of the written / read data within the actual array.

I describe the exact functionality of the ByteBuffer here.
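The position-based mechanics can be sketched in a few lines. This is a minimal, self-contained example (the buffer size and values are arbitrary): put() advances the buffer’s position, and flip() prepares the buffer for reading what was just written.

```java
import java.nio.ByteBuffer;

public class ByteBufferDemo {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);

        // put() writes at the current position and advances it
        buffer.put((byte) 1);
        buffer.put((byte) 2);
        buffer.put((byte) 3);

        // flip() sets the limit to the current position and the
        // position back to 0, so the written bytes can now be read
        buffer.flip();

        while (buffer.hasRemaining()) {
            System.out.println(buffer.get());
        }
    }
}
```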

File access with FileChannel + ByteBuffer

To write data into a FileChannel or read from it, you need a ByteBuffer.

Accessing a file via FileChannel and ByteBuffer

Data is put into the ByteBuffer with put() and then written from the buffer to the file with FileChannel.write(buffer). FileChannel.write() calls get() on the buffer to retrieve the data.

Using FileChannel.read(buffer), data is read from the file. The read() method puts the data into the ByteBuffer with put(); from there, you can retrieve it with get().

Advantages of FileChannel

FileChannel provides the following advantages over the FileInputStream and FileOutputStream classes introduced in the first two parts of the series:

  • You can read and write at any position within the file.
  • You can force the operating system to write changed data from the cache to the storage medium.
  • You can map sections of a file to memory (“memory-mapped file“), which allows very efficient data access.
  • You can set locks on file sections so that other threads and processes cannot access them simultaneously.
  • Data can be transferred very efficiently from one channel to another.
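The last point refers to FileChannel.transferTo() (and its counterpart transferFrom()), which can copy data between channels without routing it through a Java-side buffer. Here is a minimal sketch, assuming hypothetical file names; note that transferTo() may transfer fewer bytes than requested, so it is called in a loop:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferDemo {
    public static void main(String[] args) throws IOException {
        Path source = Path.of("transfer-source.bin");
        Path target = Path.of("transfer-target.bin");
        Files.write(source, "hello channel".getBytes(StandardCharsets.UTF_8));

        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // transferTo() may transfer fewer bytes than requested,
            // so we loop until the whole file has been copied
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        }

        System.out.println(Files.readString(target));
    }
}
```

Depending on the operating system, the copy can happen entirely in kernel space, which is what makes this approach so efficient.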

Reading and writing files with FileChannel and ByteBuffer

In this chapter, I show you – using code examples – how to read and write data with FileChannel and ByteBuffer, how to access certain positions within the file, how to determine and change the file size, and how to force writing from cache to storage media.

How to read a file using FileChannel?

Opening a FileChannel to read a file

To read a file, you must first open a FileChannel. The most direct way to do this is:

Path path = ...;
FileChannel channel = FileChannel.open(path, StandardOpenOption.READ);

(You can read about how to construct a Path object in the third part of this series.)

Alternatively, you can create a FileChannel from a RandomAccessFile:

Path path = ...;
RandomAccessFile file = new RandomAccessFile(path.toFile(), "r");
FileChannel channel = file.getChannel();

… or from a FileInputStream:

Path path = ...;
FileInputStream in = new FileInputStream(path.toFile());
FileChannel channel = in.getChannel();

In this example, it makes no difference which variant you choose. In the end, the getChannel() methods create a new FileChannel using the file information stored in RandomAccessFile or FileInputStream. So, although you can only read data sequentially from a FileInputStream, this restriction does not apply to the FileChannel created from the FileInputStream.

However, the readable and writable flags are set accordingly:

  • A FileChannel created with FileInputStream.getChannel() can only be used for reading.
  • A FileChannel created with RandomAccessFile.getChannel() can be used for reading and writing.
  • How you can use a FileChannel created with FileChannel.open() is determined by the options passed in as the second parameter. Since we specified StandardOpenOption.READ in the example above, only read access is allowed in this case. (You can find an overview of the available options in the JavaDoc of StandardOpenOption.)
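The read-only restriction is enforced at runtime: attempting to write to a channel opened only with StandardOpenOption.READ throws a NonWritableChannelException. A minimal sketch (the file name is a placeholder):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadOnlyChannelDemo {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("read-only-demo.bin");
        Files.write(path, new byte[]{1, 2, 3});

        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            try {
                channel.write(ByteBuffer.wrap(new byte[]{4}));
            } catch (NonWritableChannelException e) {
                // The channel was opened with READ only, so writing is rejected
                System.out.println("write rejected");
            }
        }
    }
}
```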

Reading a file with FileChannel and ByteBuffer

Once you have opened a FileChannel, you can read from it into a ByteBuffer with FileChannel.read(). The following example reads blocks of 1,024 bytes each, outputs their respective lengths and their first and last bytes – until the end of the file is reached:

Path path = Path.of("read-demo.bin");
try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocate(1024);

    int bytesRead;
    while ((bytesRead = channel.read(buffer)) != -1) {
        System.out.printf("bytes read from file: %d%n", bytesRead);
        if (bytesRead > 0) {
            System.out.printf("  first byte: %d, last byte: %d%n",
                    buffer.get(0), buffer.get(bytesRead - 1));
        }
        buffer.rewind();
    }
}

Using channel.read(buffer), we read as many bytes as possible from the file into the buffer. With buffer.get(index), we read single bytes from the buffer at an absolute index, without having to set the buffer’s position beforehand and without changing it in the process. Using buffer.rewind(), we reset the buffer’s position to 0 at the end of the loop so that it can be filled again.

Reading a file with ByteBuffer.flip() and compact()

In the following second example, we proceed somewhat differently. We read all bytes of the buffer and sum them up. Instead of accessing the data with buffer.get(index), we first use buffer.flip() to set the read position to the beginning of the buffer and then buffer.get() to read single bytes from the current read position.

We do not read the entire buffer, but only a random number of bytes, and thus simulate that we cannot process the data completely. Then we switch back to buffer write mode with buffer.compact() and read more bytes from the FileChannel. For a better understanding, I recommend you read the article Java ByteBuffer: How to use flip() and compact().

Path path = Path.of("read-demo.bin");
try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocate(1024);

    int bytesRead;
    while ((bytesRead = channel.read(buffer)) != -1) {
        System.out.printf("bytes read from file: %d%n", bytesRead);

        long sum = 0;

        buffer.flip();
        int numBytesToRead = ThreadLocalRandom.current().nextInt(buffer.remaining());
        for (int i = 0; i < numBytesToRead; i++) {
            sum += buffer.get();
        }

        System.out.printf("  bytes read from buffer: %d, sum of bytes: %d%n",
                numBytesToRead, sum);
        buffer.compact();
    }
}

In the output, we see that the first call to channel.read() reads 1,024 bytes from the file, and each subsequent call reads precisely as many bytes as we previously consumed from the buffer (i.e., as much free space as compact() made available).

How to write a file with FileChannel?

Opening a FileChannel for writing to a file

To write a file, you must open a FileChannel first. This works just as with reading:

Path path = ...;
FileChannel channel = FileChannel.open(path, 
        StandardOpenOption.CREATE, StandardOpenOption.WRITE);

Instead of StandardOpenOption.READ we specify StandardOpenOption.WRITE. Additionally, in the example, I specify StandardOpenOption.CREATE so that the file is created if it does not exist.

Other OpenOptions relevant to write operations are:

  • StandardOpenOption.CREATE_NEW: The file is created if it does not already exist; otherwise, a FileAlreadyExistsException is thrown.
  • StandardOpenOption.APPEND: Data is appended to the file. Position 0 of the FileChannel does not correspond to position 0 within the file but rather to the current length of the file.
  • StandardOpenOption.TRUNCATE_EXISTING: The file is completely cleared before writing. This option cannot be used together with APPEND.

Similar to reading, you can also create a writable FileChannel using RandomAccessFile.getChannel() or FileOutputStream.getChannel().
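As a minimal sketch (the file names are placeholders): a RandomAccessFile opened in "rw" mode yields a channel that supports both reading and writing, while a channel from a FileOutputStream is write-only.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class WritableChannelDemo {
    public static void main(String[] args) throws IOException {
        // "rw" opens the file for reading and writing;
        // the channel inherits both capabilities
        try (RandomAccessFile file = new RandomAccessFile("raf-demo.bin", "rw");
             FileChannel channel = file.getChannel()) {
            channel.write(ByteBuffer.wrap(new byte[]{1, 2, 3}));
        }

        // A channel obtained from a FileOutputStream is write-only
        try (FileOutputStream out = new FileOutputStream("fos-demo.bin");
             FileChannel channel = out.getChannel()) {
            channel.write(ByteBuffer.wrap(new byte[]{4, 5, 6}));
        }
    }
}
```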

Writing to a file using ByteBuffer.flip() and compact()

The following example writes ten times a random number of bytes into the ByteBuffer and then from there into the FileChannel:

Path path = Path.of("write-demo.bin");
try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

    ByteBuffer buffer = ByteBuffer.allocate(1024);

    for (int i = 0; i < 10; i++) {
        int bytesToWrite = ThreadLocalRandom.current().nextInt(buffer.capacity());
        for (int j = 0; j < bytesToWrite; j++) {
            buffer.put((byte) ThreadLocalRandom.current().nextInt(256));
        }

        buffer.flip();
        channel.write(buffer);
        buffer.compact();
    }

    // channel.write() doesn't guarantee that all data is written to the channel.
    // If there are remaining bytes in the buffer, write them now.
    buffer.flip();
    while (buffer.hasRemaining()) {
        channel.write(buffer);
    }
}

The buffer’s write position is initially set to 0. With buffer.put(), we fill the ByteBuffer up to a random position. With buffer.flip(), we switch to buffer read mode; with channel.write(buffer), we write the contents of the buffer to the file. And with buffer.compact(), we switch the buffer back to write mode.

Calling channel.write(buffer) does not guarantee that the entire contents of the buffer are written to the channel. Therefore, in the end, we have to call channel.write(buffer) until buffer.hasRemaining() returns false, i.e., the buffer does not contain any more data.

Once again, I recommend reading the article Java ByteBuffer: How to use flip() and compact().

How to read or write data at a specific position

The read/write position within a channel can be read with FileChannel.position() and changed at any time with FileChannel.position(newPosition). In the following example, a file is written from back to front with the bytes 0xff to 0x00. Afterward, the content is read at ten random positions and displayed on the screen.

Path path = Path.of("position-demo.bin");
try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.READ)) {

    ByteBuffer buffer = ByteBuffer.allocate(1);

    // Write backwards
    for (int pos = 255; pos >= 0; pos--) {
        buffer.put((byte) pos);
        buffer.flip();
        channel.position(pos);
        while (buffer.remaining() > 0) {
            channel.write(buffer);
        }
        buffer.compact();
    }

    // Read from random positions
    for (int i = 0; i < 10; i++) {
        long pos = ThreadLocalRandom.current().nextLong(channel.size());
        channel.position(pos);
        channel.read(buffer);
        buffer.flip();
        byte b = buffer.get();
        System.out.printf("Byte at position %d: %d%n", pos, b);
        buffer.compact();
    }
}

In the section “Memory-mapped Files”, you will see how you can code this much more elegantly.

How to determine file size?

You can determine a file’s size as follows:

long fileSize = channel.size();

How to change the file size

How to expand a file?

When writing to a file, the file is automatically expanded when you write beyond the end of the file.

For example, you could create a 1 GB file (assuming it doesn’t already exist) containing 2³⁰ − 1 zeros followed by a single one as follows:

Path path = Path.of("1g-demo.bin");
try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

    ByteBuffer buffer = ByteBuffer.allocate(1);
    buffer.put((byte) 1);
    buffer.flip();

    channel.position((1 << 30) - 1);
    channel.write(buffer);
}

How to shrink a file?

To reduce the size of a file, you have to call channel.truncate(). In the following example, the previously created 1 GB file is truncated to 1 KB:

Path path = Path.of("1g-demo.bin");
try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
    channel.truncate(1 << 10);
}

If the specified new size is greater than or equal to the current size, invoking truncate() has no effect.

How to force writing from file cache to storage medium?

For performance reasons, the operating system caches changes to files and usually does not write them immediately to the storage medium.

Using channel.force(boolean metaData), you can instruct the operating system to write all changes immediately. The parameter metaData determines whether metadata (such as the time of the last modification and the last access) should also be written to the storage medium right away:

  • If you specify true, metadata is also written immediately, which requires additional I/O operations and therefore takes longer.
  • If you specify false, only the actual file content is written.

There is no guarantee that the value of the metaData flag is respected on all operating systems.
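Here is a minimal sketch (the file name and content are placeholders) that writes some data and then flushes the file content, but not necessarily the metadata, to the storage medium:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ForceDemo {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("force-demo.bin");
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            channel.write(ByteBuffer.wrap(
                    "important data".getBytes(StandardCharsets.UTF_8)));

            // Flush the file content (but not necessarily the metadata)
            // from the OS cache to the storage medium
            channel.force(false);
        }
    }
}
```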

Memory-mapped Files: How to map a file section into memory

A special kind of ByteBuffer is the MappedByteBuffer – it maps a section of a file directly into memory (therefore: “memory-mapped file”). This allows very efficient access to the file without having to use FileChannel.write() and read(). The MappedByteBuffer can be accessed like a byte array, i.e., it can be written to at any position and read from any position. Changes are written transparently to the file in the background.

Direct mapping results in an enormous performance gain over conventional reading and writing. The file is mapped directly into the memory’s “user space”. In contrast, with conventional writing and reading methods, data must be copied back and forth between “kernel space” and “user space”.

The following code is an elegant rewrite of the example from the section “How to read or write data at a specific position” (in which a file was written from back to front and then read at random positions) – using a MappedByteBuffer:

Path path = Path.of("mapped-byte-buffer-demo.bin");
try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.READ)) {

    MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 256);

    // Write backwards
    for (int pos = 255; pos >= 0; pos--) {
        buffer.put(pos, (byte) pos);
    }

    // Read from random positions
    for (int i = 0; i < 10; i++) {
        int pos = ThreadLocalRandom.current().nextInt((int) channel.size());
        byte b = buffer.get(pos);
        System.out.printf("Byte at position %d: %d%n", pos, b);
    }
}

Characteristics of memory-mapped files

Please note the following when using memory-mapped files:

  • You must specify the position and size of the section to be mapped at the beginning. In the example, the first 256 bytes are mapped. If the file does not exist, a 256-byte file is created. If the file exists and is smaller, it is expanded to 256 bytes. If the file is larger, its size and contents after the first 256 bytes remain unchanged.
  • A maximum of 2 GB can be mapped into memory. When the MappedByteBuffer was introduced with Java 1.4 in 2002, Java developers apparently could not imagine that today almost every developer laptop is equipped with 16 to 32 GB RAM. Up to and including Java 15 (early access), this limit has not been increased.
  • MappedByteBuffer does not implement the Closeable interface. Therefore, in the example above, we cannot create it within the try block. There is also no method to “un-map” it manually. If we tried to delete the file at the end of the example above, we would get an AccessDeniedException in most cases. The MappedByteBuffer is removed by the garbage collector when it is no longer needed. To “un-map” the file, it registers a so-called “Cleaner”, which is invoked when the MappedByteBuffer is only “phantom reachable”. In the code of the performance test described below, you can find a hack using sun.misc.Unsafe to un-map the file manually.

Creating a MappedByteBuffer from a FileInputStream / FileOutputStream

We have seen earlier that we can also create a FileChannel using FileInputStream.getChannel() or FileOutputStream.getChannel(). What happens if we try to map such a channel into memory?

Since that FileChannel is practically independent of FileInputStream or FileOutputStream, the following is possible without problems:

var fis = new FileInputStream(fileName);
var channel = fis.getChannel();
var map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

The following, however, does not work:

var fos = new FileOutputStream(fileName);
var channel = fos.getChannel();
var map = channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());

The command channel.map() results in a NonReadableChannelException because the FileChannel created by FileOutputStream.getChannel() allows only write access – MapMode.READ_WRITE however, requires read access. There is no MapMode.WRITE_ONLY.

How to lock files and file sections

For more complex applications (e.g., a file or database server), you may want to access the same file from different threads or even processes. Therefore, entire files or file sections that are being written to must be locked so that no other threads or processes can access them at the same time.

Locking is supported directly by the operating system and the file system. It therefore also works between different Java programs, or between Java programs and any other processes on the same system. And when using shared storage (e.g., a network drive), it even works between processes on different systems.

A distinction is made between shared locks (“read locks”) and exclusive locks (“write locks”). If one process holds an exclusive lock on a file section, no other process can get a lock on the same or an overlapping file section – neither an exclusive nor a shared lock. If one process holds a shared lock, other processes can also get shared locks to the same or overlapping file sections.

You can set a lock with the following methods:

  • FileChannel.lock(position, size, shared) – this method waits until a lock of the requested type (shared = true → shared; shared = false → exclusive) can be set for the file section specified by position and size.
  • FileChannel.lock() – this method waits until an exclusive lock can be set for the entire file.
  • FileChannel.tryLock(position, size, shared) – this method tries to set a lock of the requested type for the specified file section. If a lock cannot be obtained, the method does not wait but returns null instead.
  • FileChannel.tryLock() – this method tries to set an exclusive lock for the whole file. If this is not possible, it returns null.

If the lock is successfully set, the methods return a FileLock object. You can release the lock using its release() or close() method. Here is a simple example that sets an exclusive lock on the entire file and then writes 1,000 random bytes:

Path path = Path.of("lock-demo.bin");

byte[] bytes = new byte[1000];
ThreadLocalRandom.current().nextBytes(bytes);

try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE);
     FileLock lock = channel.lock()) {
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    channel.write(buffer);
}
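The non-blocking variant requires a null check. Here is a minimal sketch (the file name and section size are placeholders) that tries to lock the first 100 bytes exclusively and only writes if the lock was obtained:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TryLockDemo {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("try-lock-demo.bin");
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

            // Try to lock the first 100 bytes exclusively, without blocking
            FileLock lock = channel.tryLock(0, 100, false);
            if (lock == null) {
                System.out.println("section is locked by another process");
            } else {
                try {
                    channel.write(ByteBuffer.wrap(new byte[]{42}));
                } finally {
                    lock.release();
                }
            }
        }
    }
}
```

Note that within the same JVM, an overlapping lock request throws an OverlappingFileLockException instead of returning null; the null return signals a lock held by another process.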

Performance Tests

I wrote a program to measure the performance of different write methods – at different buffer and file sizes.

In order to obtain a result that is as free of side effects as possible, I repeated each test 32 times and then determined the median. I created files from 1 MB to 1 GB in size and provided between 1 KB and 1 MB for the ByteBuffer.

All tests are performed without force(). I want to test the speed at which the data is transferred to the operating system, not the speed of the storage media.

You are welcome to clone the test program from my GitLab repository and run it on your system.

The test program also measures the write speed of those FileChannels that are created via RandomAccessFile.getChannel() and FileOutputStream.getChannel(). However, since the test results are almost identical to those of FileChannel.open(), I do not show them in the following sections.

Test results

The test results are too extensive to be printed here in full. You can have a look at them in this Google Document.

First and foremost, the writing speed depends on the type of access – sequential or random access. Interestingly, buffer and file size also have a significant impact on the result.

Test results for sequential write access

With sequential write access, speed increases continuously up to a file size of 128 MB; after that, it stagnates or decreases. I suspect that from this size on, the operating system starts to write data to the storage medium, so from here on, its speed is included in the measurement results. Therefore, I only show results up to a file size of 128 MB.

The following four diagrams show the write speed in relation to the buffer size for file sizes of 1 MB, 8 MB, 16 MB, and 128 MB.

Sequential file write speed for 1 MB files
Sequential file write speed for 8 MB files
Sequential file write speed for 16 MB files
Sequential file write speed for 128 MB files

Test results for sequential write access – Analysis

For files up to 8 MB in size, memory-mapped files are fastest, regardless of the buffer size.

For 16 MB files, this is only valid up to a buffer size of 16 KB. With a buffer size of 32 KB or more, a FileChannel with a native ByteBuffer is faster. With a file size of 128 MB, FileChannel is faster already at a buffer size of 16 KB.

The native ByteBuffer is up to 20% faster than the ByteBuffer on the Java heap. The larger the file and buffer, the higher the performance gain from the native buffer.

Up to a buffer size of 8 KB, FileOutputStream with BufferedOutputStream is faster than FileChannel. Above 8 KB buffer size, stream and channel with heap buffer are about the same speed. The limit of 8 KB is due to BufferedOutputStream‘s internal 8 KB buffer. BufferedOutputStream first fills the buffer before it writes the data to the file.

Starting from 1 MB buffer size, write speed decreases for all write methods and file sizes.

Test results for random write access

The following three diagrams show the random access write speed in relation to the buffer size for file sizes of 1 MB, 8 MB, and 128 MB. I have not written any larger files because the random write access tests generally take much longer than the sequential write access tests.

Random access file write speed for 1 MB files
Random access file write speed for 8 MB files
Random access file write speed for 128 MB files

Test results for random write access – Analysis

With random write access, memory-mapped files are the fastest way to write files – regardless of file and buffer sizes. FileChannel follows by far; the performance gain from native buffers can reach up to 20% here as well.

Conclusion

For random write access, the choice should always be memory-mapped files.

With sequential write access, you can also work with memory-mapped files for file sizes up to 8 MB. For larger files, the best performance is achieved with FileChannel and a direct ByteBuffer with a minimum size of 16 KB and a maximum size of 512 KB.

Of course, these are only rough guidelines, derived from measurement results on my system. If you want to tune the performance down to the last MB/s, I recommend testing different write methods and buffer sizes for your specific use case.

Summary

In today’s article, I showed you what FileChannel and ByteBuffer are and how to read and write files with them. You learned what memory-mapped files are and how to set locks on file sections so that other processes cannot write to them simultaneously.

This concludes the six-part series about files in Java. If you liked this article (or the whole series), feel free to share it using one of the share buttons below. If you would like to be informed about new articles, you can join my mailing list by filling out the form below.

I’d like to know from you: What part of this series did you find most helpful or most enjoyable? Leave me a comment!
