
I/O in Java - Basics (Part 1)

As the name suggests, Java I/O is used to handle input and output operations in an application. Most applications need to process some input and produce output based on it. In this post, we will see how I/O works at the machine level.

To understand these concepts in more detail, we should familiarize ourselves with the basic terminology -

Buffers

A buffer is a region of physical memory used to temporarily store data while it is moved from one place to another. The purpose of most buffers is to act as a holding area, enabling the CPU to manipulate data before transferring it to a device.

Because the processes of reading and writing data to the disk are relatively slow, many programs keep track of data changes in a buffer and then copy the buffers to the disk.

For example, word processors employ a buffer to keep track of changes to files. When we save the file, the word processor updates the disk file with the contents of the buffer. Clearly, this is much more efficient than accessing the disk file each time we make a change.

Since our changes are initially stored in the buffer, not on the disk, all of them will be lost if the computer fails during an editing session. For this reason, it is a good practice to save our file periodically.
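The same behaviour can be seen in Java. The sketch below is only an illustration: the class name `BufferedWriteDemo` and the helper `sizesAroundFlush` are made up for this post, and it assumes the text is smaller than the writer's internal buffer. It writes to a `BufferedWriter` and compares the file size before and after `flush()`.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedWriteDemo {

    // Hypothetical helper: returns {file size before flush, file size after flush}.
    // Assumes the text is small enough to fit in the writer's in-memory buffer.
    static long[] sizesAroundFlush(Path file, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);              // the text lands in the in-memory buffer
            long before = Files.size(file);  // the file itself is still empty
            writer.flush();                  // hand the buffered data over to the OS
            long after = Files.size(file);
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffer-demo", ".txt");
        try {
            long[] sizes = sizesAroundFlush(file, "hello buffer");
            System.out.println("before flush: " + sizes[0] + " bytes");
            System.out.println("after flush:  " + sizes[1] + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the program crashed between `write` and `flush`, the text would be lost, just like the unsaved changes in the word processor.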

Kernel

The kernel is the part of the OS that loads first and remains in the main memory. It is usually located in a protected area of memory to prevent it from being overwritten by applications or other parts of the OS.

Typically, it is responsible for memory management, process and task management, and disk management. The kernel connects the system hardware to the application software.

User Space

User space is the portion of system memory where user applications run. This contrasts with kernel space, which is the memory allocated to the kernel and the OS. Separating user space from kernel space protects the system from errant processes that might otherwise use memory required by the OS.

Buffer Handling and Kernel vs User Space

Buffers, and how they are handled, form the basis of I/O. Input/Output means nothing more than moving data in and out of buffers.

Write operation - asking the OS to drain data from the buffer.

Read operation - asking the OS to fill the buffer with data. A read typically proceeds as follows:

  1. The process requests that its buffer should be filled by making the read() call.
  2. This call causes the kernel to issue a command to the disk controller hardware to fetch the data from disk.
  3. The disk controller hardware writes data directly into a kernel memory buffer.
  4. Once the disk controller finishes putting data into the buffer, the kernel copies the data from the temporary buffer in kernel space to the buffer specified by the process.

An important thing to note here is that the kernel tries to cache data, so the requested data may already be available in kernel space. If so, it is simply copied out to the process's buffer. If not, the process is suspended while the kernel brings the data into memory.
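From the Java side, the steps above are hidden behind a single read call. Here is a minimal sketch using `FileChannel` and `ByteBuffer` from `java.nio`; the class name `ReadIntoBufferDemo`, the helper `readAll` and the buffer size of 8 bytes are just assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReadIntoBufferDemo {

    // Hypothetical helper: reads a whole file by repeatedly asking the OS
    // to fill a small buffer owned by this process. bufferSize must be > 0.
    static String readAll(Path file, int bufferSize) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);   // the process's buffer
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {   // "fill my buffer"; -1 means end of file
                buffer.flip();                     // switch from filling to draining
                collected.write(buffer.array(), 0, buffer.limit());
                buffer.clear();                    // make the buffer ready for the next fill
            }
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-demo", ".txt");
        try {
            Files.write(file, "moving data into a buffer".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAll(file, 8));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Each `channel.read(buffer)` is a request to the OS to fill our buffer. Whether the data comes from the kernel's cache or from the disk is invisible to the program.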

Virtual Memory

Virtual memory is a storage scheme that gives the illusion of a very large main memory. This is achieved by treating a part of secondary memory as if it were main memory. Thus, the user can load processes that are larger than the available main memory.

Instead of loading one big process, the OS loads different chunks of multiple processes in the main memory. Due to this, both the degree of multiprogramming and CPU utilization increase.

In the virtual memory scheme, when some pages need to be loaded for execution and not enough main memory is available, the OS does not refuse to load them. Instead, it looks for the area of RAM that has been used least recently, or that is not referenced at all, and copies it to secondary memory to make space for the new pages in the main memory.
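To make the "least recently used" idea concrete, here is a toy simulation. It is only a sketch: real operating systems use their own page replacement algorithms, often approximations of LRU, and the class `LruPagingDemo`, the helper `evictedPages` and the page numbers are invented for illustration. It relies on the access-order mode of `LinkedHashMap`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LruPagingDemo {

    // Hypothetical helper: simulates a fixed number of RAM frames and
    // returns the pages that were pushed out to secondary memory, in order.
    static List<Integer> evictedPages(int frames, int[] accesses) {
        List<Integer> evicted = new ArrayList<>();
        // accessOrder = true keeps the least recently used entry at the head
        LinkedHashMap<Integer, Boolean> ram =
                new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                        if (size() > frames) {
                            evicted.add(eldest.getKey());  // this page is swapped out
                            return true;
                        }
                        return false;
                    }
                };
        for (int page : accesses) {
            ram.put(page, Boolean.TRUE);  // touching a page marks it as recently used
        }
        return evicted;
    }

    public static void main(String[] args) {
        int[] accesses = {1, 2, 3, 1, 4, 2, 5};
        System.out.println(evictedPages(3, accesses));  // prints [2, 3, 1]
    }
}
```

With 3 frames, page 1 survives the arrival of page 4 because it was touched again just before, so page 2 is evicted first.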

Virtual Memory provides two main advantages -

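  1. More than one virtual address can refer to the same physical memory location - copying from kernel space to the user buffer is an overhead. By mapping a kernel space address and a user space address to the same physical memory, the disk controller can fill a buffer that is visible to both, which removes the extra copy.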
  2. A virtual memory space can be larger than the actual hardware memory available - virtual memory paging.
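In Java, the closest thing to the first advantage is a memory-mapped file, exposed through `FileChannel.map`. The sketch below is an assumption-laden illustration: the class `MappedFileDemo` and the helper `readMapped` are made up, and how much copying is actually avoided depends on the OS and the JVM.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileDemo {

    // Hypothetical helper: maps a non-empty file into the process's
    // virtual address space and reads its content from that mapping.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[mapped.remaining()];
            mapped.get(bytes);  // pages are brought into memory as they are touched
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".txt");
        file.toFile().deleteOnExit();  // some platforms refuse to delete a mapped file
        Files.write(file, "mapped bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println(readMapped(file));
    }
}
```

There is no explicit `read()` call here. The program simply touches memory, and the OS pages the file content in behind the scenes.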

Memory Paging

In paging, virtual pages that do not fit in physical memory are persisted to external disk storage. This makes space in physical memory for additional virtual pages. Thus, the physical memory acts as a cache for such pages.

Making the memory page size a multiple of the disk block size allows the kernel to issue direct commands to the disk controller hardware to read and write pages. On most operating systems, all disk I/O is done at the page level, and this is the only way data ever moves between the disk and physical memory.

You can read more in detail here.

File/Block I/O

The file system is very different from the disk. The disk does not know anything about the semantics of a file. A disk stores data in sectors (usually 512 bytes each), which are nothing but slots to store data. In this respect, the sectors of a disk are similar to memory pages: all are of uniform size and addressable as a large array.

On the other hand, a file system organizes the data stored on a disk. Our application code never interacts with the disk directly, only with the file system.

When a user program requests some data, the file system implementation determines exactly where on disk that data resides. It then brings the data into the main memory.

The file system data is also cached like other memory pages. On subsequent I/O requests, data may be present in the physical memory and can be reused without rereading from the disk.
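To see what this looks like from application code, here is a small sketch that reads a few bytes at a given offset of a file. The class `FileOffsetReadDemo` and the helper `readAt` are hypothetical names, and the sample content is assumed to be ASCII. Note that the program only ever talks about a path and a byte offset, never about sectors.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileOffsetReadDemo {

    // Hypothetical helper: reads up to `length` bytes starting at `offset`.
    // The file system decides which disk blocks actually hold those bytes.
    static String readAt(Path file, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = offset;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);  // positional read
                if (n == -1) {
                    break;                               // reached end of file
                }
                position += n;
            }
        }
        buffer.flip();
        return new String(buffer.array(), 0, buffer.limit(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("offset-demo", ".txt");
        try {
            Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
            System.out.println(readAt(file, 10, 3));  // prints abc
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Reading the same range a second time may be served from the cached pages rather than from the disk, but the code would look exactly the same.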

Stream I/O

The bytes of an I/O stream must be accessed sequentially. Network connections are a common example of streams.

Streams are generally slower than block devices and are often the source of intermittent input. Most operating systems allow streams to be placed into non-blocking mode, which lets a process check whether input is available on the stream without getting stuck if none is available at the moment. Such a capability allows a process to handle input as it arrives and perform other functions while the input stream is idle.
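Non-blocking mode can be tried out without a network, using an in-process `Pipe` from `java.nio.channels`. This is only a minimal sketch: the class `NonBlockingStreamDemo`, the helper `pollOnce` and the "ping" message are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class NonBlockingStreamDemo {

    // Hypothetical helper: returns whatever is available right now,
    // or null if the stream is idle (or already closed).
    static String pollOnce(Pipe.SourceChannel source, ByteBuffer buffer) throws IOException {
        buffer.clear();
        int n = source.read(buffer);  // non-blocking: returns 0 at once when nothing is ready
        if (n <= 0) {
            return null;
        }
        buffer.flip();
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        Pipe.SourceChannel source = pipe.source();
        source.configureBlocking(false);  // switch the reading end to non-blocking mode
        ByteBuffer buffer = ByteBuffer.allocate(64);

        // Nothing has been written yet, so this returns immediately with null.
        System.out.println("first poll: " + pollOnce(source, buffer));

        pipe.sink().write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));

        String message;
        while ((message = pollOnce(source, buffer)) == null) {
            Thread.sleep(10);  // stand-in for other useful work while the stream is idle
        }
        System.out.println("received: " + message);

        pipe.sink().close();
        source.close();
    }
}
```

With a blocking channel, the first `read` would have stalled the thread until data arrived. Here the process stays free to do something else in the meantime.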

Conclusion

Phew! Such a complex topic with loads of technical terms. Today we discussed the lower-level details of I/O, which will lay the foundation for I/O in Java in further posts.

I would love to hear your thoughts on this and would like to have your suggestions to make it better.

Feel free to befriend me on Facebook, Twitter or LinkedIn, or say Hi by email.

Happy Coding 😊
