Shared Memory
As we have seen, many methods were created in order to let processes
communicate. All this communication is done in order to share data. The problem
is that all these methods are sequential in nature. What can we do in order to
allow processes to share data in a random-access manner?
Shared memory comes to the rescue. As you might know, on a Unix system, each
process has its own virtual address space, and the system makes sure no process
can access the memory area of another process. This means that if one process
corrupts its memory's contents, this does not directly affect any other process
in the system.
With shared memory, we declare a given section in the memory as one that will
be used simultaneously by several processes. This means that the data found in
this memory section (or memory segment) will be seen by several processes. This
also means that several processes might try to alter this memory area at the
same time, and thus some method should be used to synchronize their access to
this memory area (did anyone say "apply mutual exclusion using a semaphore" ?).
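To make the need for synchronization concrete, here is a minimal sketch in which several child processes increment one counter kept in shared memory, using a System V semaphore as a lock. The names 'sem_change' and 'locked_counter_demo' are made up for this illustration, and the shared memory calls it uses are explained in the sections that follow.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* on Linux the program itself must define this union for semctl(); */
/* some other systems already define it in <sys/sem.h>. */
union semun {
    int val;
    struct semid_ds* buf;
    unsigned short* array;
};

/* add 'delta' to semaphore 0: -1 locks (may block), +1 unlocks. */
static int sem_change(int sem_id, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(sem_id, &op, 1);
}

/* hypothetical demo: 'nprocs' children each add 1 to a shared counter */
/* 'rounds' times. returns the final counter value, or -1 on error. */
long locked_counter_demo(int nprocs, int rounds)
{
    union semun arg;
    long result = -1;
    int i, r, started = 0;
    volatile long* counter = NULL;
    int shm_id = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    int sem_id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (shm_id != -1)
        counter = shmat(shm_id, NULL, 0);
    arg.val = 1;    /* initial value 1 means "unlocked". */
    if (shm_id == -1 || sem_id == -1 || counter == (void*) -1 ||
        semctl(sem_id, 0, SETVAL, arg) == -1) {
        perror("setup");
        goto cleanup;
    }
    *counter = 0;
    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();

        if (pid == 0) {     /* child: update the counter under the lock. */
            for (r = 0; r < rounds; r++) {
                if (sem_change(sem_id, -1) == -1)
                    _exit(1);
                (*counter)++;
                sem_change(sem_id, 1);
            }
            _exit(0);
        }
        if (pid > 0)
            started++;
    }
    while (started-- > 0)   /* wait for all children to finish. */
        wait(NULL);
    result = *counter;

cleanup:
    if (counter != NULL && counter != (void*) -1)
        shmdt((const void*) counter);
    if (shm_id != -1)
        shmctl(shm_id, IPC_RMID, NULL);
    if (sem_id != -1)
        semctl(sem_id, 0, IPC_RMID);
    return result;
}
```

With the lock in place, a call such as locked_counter_demo(4, 1000) should return 4000. Without the two sem_change() calls, increments from different processes may overwrite each other and the total may come out lower.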
Background - Virtual Memory Management Under Unix
In order to understand the concept of shared memory, we should first check
how virtual memory is managed on the system.
In order to achieve virtual memory, the system divides memory into small
pages, each of the same size. For each process, a table mapping virtual memory
pages into physical memory pages is kept. When the process is scheduled for
running, its memory table is loaded by the operating system, and each memory
access is translated (by the CPU) to an address in a physical memory page. If
the virtual memory page is not found in physical memory, it is looked up in
swap space and loaded from there (this operation is also called 'page in').
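To make the idea of pages a little more tangible, here is a small sketch that asks the system for its page size and splits an address into a page number and an offset inside that page. The helper names ('page_number', 'page_offset', 'show_address') are made up for this illustration. The real translation to physical pages is done by the CPU and the operating system, not by user code.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* hypothetical helper: index of the virtual page containing 'addr'. */
uintptr_t page_number(const void* addr)
{
    uintptr_t page_size = (uintptr_t) sysconf(_SC_PAGESIZE);

    return (uintptr_t) addr / page_size;
}

/* hypothetical helper: offset of 'addr' inside its page. */
uintptr_t page_offset(const void* addr)
{
    uintptr_t page_size = (uintptr_t) sysconf(_SC_PAGESIZE);

    return (uintptr_t) addr % page_size;
}

/* print the virtual page number and offset of the given address. */
void show_address(const void* addr)
{
    printf("address %p: page %lu, offset %lu\n", addr,
           (unsigned long) page_number(addr),
           (unsigned long) page_offset(addr));
}
```

Calling show_address() with the address of any variable shows which virtual page it lives in. Two variables with the same page number share one entry in the process's page table.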
When the process is started, it is allocated a memory segment to hold
the runtime stack, a memory segment to hold the program's code (the code
segment), and a memory area for data (the data segment). Each such segment might
be composed of many memory pages. Whenever the process needs to allocate more
memory, new pages are allocated for it, to enlarge its data segment.
When a process is forked off from another process, the memory page
table of the parent process is copied to the child process, but not the
pages themselves. If the child process tries to update any of these pages,
then that specific page is copied, and only the child's copy is
modified. This behavior (known as copy-on-write) is very efficient for
processes that call fork() and immediately use the exec() system call
to replace the program they run.
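The effect of this copying can be observed from a program. The following sketch (the function name 'fork_private_copy_demo' is made up for this illustration) forks a child that modifies a variable, and then checks the value the parent sees.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* hypothetical demo: returns the parent's value of 'counter' after the */
/* child has modified its own copy, or -1 on error. */
int fork_private_copy_demo(void)
{
    volatile int counter = 1;
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* child: this write lands in the child's private copy. */
        counter = 100;
        _exit(0);
    }
    /* parent: wait for the child, then look at our own copy. */
    if (waitpid(pid, NULL, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return counter;
}
```

The function returns 1: the child's write never reaches the parent, which is exactly the isolation that shared memory is meant to bypass.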
What we see from all of this is that all we need in order to support shared
memory is to mark some memory pages as shared, and to allow a way to identify
them. This way, one process will create a shared memory segment, and other
processes will attach to it (by placing the physical addresses of its pages in
their own memory page tables). From now on, all these processes will access the
same physical memory when accessing these pages, thus sharing this memory area.
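As a rough preview, here is a sketch in which a parent and a child share one integer through a shared memory segment. The function name 'fork_shared_segment_demo' is made up for this illustration, and using IPC_PRIVATE as the key is just a convenient way to get a fresh segment. The calls themselves are explained in the following sections.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* hypothetical demo: returns the value the parent reads from the shared */
/* segment after the child wrote to it, or -1 on error. */
int fork_shared_segment_demo(void)
{
    int result = -1;
    int* shared;
    pid_t pid;
    /* IPC_PRIVATE asks for a brand new segment with no public key. */
    int shm_id = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);

    if (shm_id == -1) {
        perror("shmget");
        return -1;
    }
    shared = shmat(shm_id, NULL, 0);
    if (shared == (void*) -1) {
        perror("shmat");
        shmctl(shm_id, IPC_RMID, NULL);
        return -1;
    }
    *shared = 1;

    pid = fork();           /* the attachment is inherited by the child. */
    if (pid == 0) {
        *shared = 100;      /* child: write goes to the shared memory. */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) != -1)
        result = *shared;   /* parent: sees the child's write. */
    else
        perror("fork/waitpid");

    shmdt(shared);
    shmctl(shm_id, IPC_RMID, NULL);
    return result;
}
```

Unlike an ordinary variable, which stays private to each process after fork(), the value written by the child is visible to the parent, so the function returns 100.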
Allocating A Shared Memory Segment
A shared memory segment first needs to be allocated (created), using the
shmget() system call. This call gets a key for the segment (like
the keys used in msgget() and semget()), the desired
segment size, and flags to denote access permissions and whether to create this
segment if it does not exist yet. shmget() returns an identifier that
can be later used to access the memory segment. Here is how to use this call:
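/* headers needed for shmget(), perror() and exit(). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
/* this variable is used to hold the returned segment identifier. */
int shm_id;
/* allocate a shared memory segment with size of 2048 bytes, */
/* accessible only to the current user. */
shm_id = shmget(100, 2048, IPC_CREAT | IPC_EXCL | 0600);
if (shm_id == -1) {
    perror("shmget");
    exit(1);
}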
If several processes try to allocate a segment using the same key, they will all
get an identifier for the same segment, unless they specified IPC_EXCL
in the flags to shmget(). In that case, the call will succeed only
if the segment did not exist before, and will fail with errno set to EEXIST
otherwise.
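A common way to use IPC_EXCL is a "create or open" pattern: try to create the segment exclusively, and if it already exists, open it instead. Here is a minimal sketch of that idea. The function name 'shm_create_or_open' and its 'created' output parameter are made up for this illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* hypothetical helper: create the segment if it is missing, otherwise */
/* open the existing one. '*created' is set to 1 only if we created it. */
int shm_create_or_open(key_t key, size_t size, int* created)
{
    int shm_id = shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);

    if (shm_id != -1) {
        *created = 1;       /* we are first, so we should initialize it. */
        return shm_id;
    }
    if (errno != EEXIST) {  /* a real error, not "already exists". */
        perror("shmget");
        return -1;
    }
    /* the segment already exists: open it without creating. a size of 0 */
    /* is accepted when we refer to an existing segment. */
    *created = 0;
    shm_id = shmget(key, 0, 0600);
    if (shm_id == -1)
        perror("shmget");
    return shm_id;
}
```

The 'created' flag tells the caller whether it is responsible for initializing the segment's contents, so that only one process does so.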
Attaching And Detaching A Shared Memory Segment
After we have allocated a memory segment, we need to add it to the memory page table
of the process. This is done using the shmat() (shared-memory
attach) system call. Assuming 'shm_id' contains an identifier returned by a call
to shmget(), here is how to do this:
/* these variables are used to specify where the segment is attached. */
char* shm_addr;
char* shm_addr_ro;
/* attach the given shared memory segment, at some free position */
/* that will be allocated by the system. */
shm_addr = shmat(shm_id, NULL, 0);
if (shm_addr == (void*) -1) { /* shmat() returns (void*)-1 on failure. */
    perror("shmat");
    exit(1);
}
/* attach the same shared memory segment again, this time in */
/* read-only mode. Any write operation to this segment using this */
/* address will cause a segmentation violation (SIGSEGV) signal. */
shm_addr_ro = shmat(shm_id, NULL, SHM_RDONLY);
if (shm_addr_ro == (void*) -1) { /* operation failed. */
    perror("shmat");
    exit(1);
}
As you can see, a segment may be attached in read-only mode, or in read-write
mode. The same segment may be attached several times by the same process, and
then all the given addresses will refer to the same data. In the example above,
we can use 'shm_addr' to access the segment both for reading and for writing,
while 'shm_addr_ro' can be used for read-only access to this segment. Attaching
a segment in read-only mode makes sense if our process is not supposed to alter
this memory segment, and is recommended in such cases. The reason is that if a
bug in our process causes it to corrupt its memory image, it might corrupt the
contents of the shared segment, thus possibly causing all other processes using
this segment to crash. By using a read-only attachment, we protect the rest of
the processes from a bug in our process.
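Detaching is the reverse operation, and is done with the shmdt() (shared-memory detach) system call, which takes the address returned by shmat(). Here is a minimal sketch (the function name 'detach_demo' is made up for this illustration) showing that detaching removes the mapping from our process but leaves the segment and its data intact.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* hypothetical demo: write a string, detach, attach again and check that */
/* the data is still in the segment. returns 0 on success, -1 on error. */
int detach_demo(int shm_id)
{
    int same;
    char* addr = shmat(shm_id, NULL, 0);

    if (addr == (void*) -1) {
        perror("shmat");
        return -1;
    }
    strcpy(addr, "hello");

    /* detach: 'addr' must not be used after this call succeeds. */
    if (shmdt(addr) == -1) {
        perror("shmdt");
        return -1;
    }

    /* the segment itself still exists, so we can attach it again. */
    addr = shmat(shm_id, NULL, SHM_RDONLY);
    if (addr == (void*) -1) {
        perror("shmat");
        return -1;
    }
    same = (strcmp(addr, "hello") == 0);
    if (shmdt(addr) == -1) {
        perror("shmdt");
        return -1;
    }
    return same ? 0 : -1;
}
```

Note that the second attachment may well land at a different address than the first one. Only the contents of the segment are guaranteed to be the same.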
Placing Data In Shared Memory
Placing data in a shared memory segment is done by using the pointer returned
by the shmat() system call. Any kind of data may be placed in a
shared segment, except for pointers. The reason for this is simple: pointers
contain virtual addresses. Since the same segment might be attached at a
different virtual address in each process, a pointer referring to one memory
area in one process might refer to a different memory area in another process.
We can try to work around this problem by attaching the shared segment at the
same virtual address in all processes (by supplying an address as the second
parameter to shmat(), and adding the SHM_RND flag to
its third parameter, which rounds the address down to a multiple of SHMLBA),
but this might fail if the given virtual address is already in use by the
process.
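A more portable workaround is to store offsets from the start of the segment instead of pointers, and to convert an offset to a pointer only inside the process that uses it. Here is a minimal sketch of that idea. The structure 'record' and the helpers 'shm_ptr', 'record_init' and 'record_name' are made up for this illustration, and 'base' stands for the address returned by shmat() in the calling process.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical record kept at the start of the shared segment. 'name_off' */
/* is the distance in bytes from the segment's start to the name string. */
struct record {
    size_t name_off;
};

/* convert an offset into a pointer that is valid in the calling process. */
void* shm_ptr(void* base, size_t off)
{
    return (char*) base + off;
}

/* store 'name' right after the record, remembering only its offset. */
void record_init(void* base, const char* name)
{
    struct record* rec = base;

    rec->name_off = sizeof(struct record);
    strcpy(shm_ptr(base, rec->name_off), name);
}

/* find the name, whatever address the segment is attached at. */
const char* record_name(void* base)
{
    const struct record* rec = base;

    return shm_ptr(base, rec->name_off);
}
```

Since the offset does not depend on where the segment is attached, each process can call record_name() with its own attachment address and reach the same string.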
Here is an example of placing data in a shared memory segment, and later on
reading this data. We assume that 'shm_addr' is a character pointer, containing
an address returned by a call to shmat().
/* define a structure to be used in the given shared memory segment. */
struct country {
    char name[30];
    char capital_city[30];
    char currency[30];
    int population;
};
/* define a countries array variable. */
int* countries_num;
struct country* countries;
int i;
/* create a countries index on the shared memory segment. */
countries_num = (int*) shm_addr;
*countries_num = 0;
/* the array starts right after the counter. 'shm_addr' is a char*, */
/* so the addition is done in bytes. */
countries = (struct country*) (shm_addr + sizeof(int));
strcpy(countries[0].name, "U.S.A");
strcpy(countries[0].capital_city, "Washington");
strcpy(countries[0].currency, "U.S. Dollar");
countries[0].population = 250000000;
(*countries_num)++;
strcpy(countries[1].name, "Israel");
strcpy(countries[1].capital_city, "Jerusalem");
strcpy(countries[1].currency, "New Israeli Shekel");
countries[1].population = 6000000;
(*countries_num)++;
strcpy(countries[2].name, "France");
strcpy(countries[2].capital_city, "Paris");
strcpy(countries[2].currency, "Franc");
countries[2].population = 60000000;
(*countries_num)++;
/* now, print out the countries data. */
for (i=0; i < (*countries_num); i++) {
    printf("Country %d:\n", i+1);
    printf(" name: %s\n", countries[i].name);
    printf(" capital city: %s\n", countries[i].capital_city);
    printf(" currency: %s\n", countries[i].currency);
    printf(" population: %d\n", countries[i].population);
}
A few notes and 'gotchas' about this code:
- No usage of malloc().
Since the memory segment was already allocated when we called shmget(),
there is no need to use malloc() when placing data in that
segment. Instead, we do all memory management ourselves, by simple pointer
arithmetic operations. We also need to make sure the shared segment was
allocated enough memory to accommodate future growth of our data - there are
no means for enlarging the size of the segment once allocated (unlike when
using normal memory management - we can always move data to a new memory
location using the realloc() function).
- Memory alignment.
In the example above, we assumed that the segment's address is aligned properly
for an integer to be placed in it. If it were not, an attempt to
alter the contents of '*countries_num' could trigger a bus error (SIGBUS)
signal on some architectures. Further, we assumed the alignment of our
structure is the same as that needed for an integer (when we placed the
structures array right after the integer variable).
- Completeness of the data model.
By placing all the data relating to our data model in the shared memory
segment, we make sure all processes attaching to this segment can use the
full data kept in it. A naive mistake would be to place the countries
counter in a local variable, while placing the countries array in the shared
memory segment. If we did that, other processes trying to access this
segment would have no means of knowing how many countries are in there.
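One way to address the alignment and completeness notes together is to describe the whole segment with a single structure, and let the compiler work out the padding. The following is only a sketch under that assumption. The structure 'country_table' and the helpers 'table_capacity' and 'table_add' are made up for this illustration, and the table pointer is assumed to be the address returned by shmat().

```c
#include <stddef.h>
#include <string.h>

struct country {
    char name[30];
    char capital_city[30];
    char currency[30];
    int population;
};

/* hypothetical layout of the whole segment: the counter comes first, and */
/* the compiler adds any padding needed before the array. */
struct country_table {
    int countries_num;
    struct country countries[];     /* C99 flexible array member. */
};

/* how many countries fit in a segment of 'seg_size' bytes. */
size_t table_capacity(size_t seg_size)
{
    size_t header = offsetof(struct country_table, countries);

    if (seg_size < header)
        return 0;
    return (seg_size - header) / sizeof(struct country);
}

/* copy a string into a 30-byte field, truncating it if it is too long. */
static void copy_field(char* dst, const char* src)
{
    strncpy(dst, src, 29);
    dst[29] = '\0';
}

/* append a country. returns 0 on success, -1 if the segment is full. */
int table_add(struct country_table* t, size_t seg_size, const char* name,
              const char* capital, const char* currency, int population)
{
    struct country* c;

    if ((size_t) t->countries_num >= table_capacity(seg_size))
        return -1;
    c = &t->countries[t->countries_num];
    copy_field(c->name, name);
    copy_field(c->capital_city, capital);
    copy_field(c->currency, currency);
    c->population = population;
    t->countries_num++;     /* the counter lives in the segment as well. */
    return 0;
}
```

Since the segment cannot grow, table_add() refuses to write past its end instead of silently overflowing it. A real program would also need to synchronize calls to table_add() between processes.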
Destroying A Shared Memory Segment
After we have finished using a shared memory segment, we should destroy it. It
is safe to destroy it even if it is still in use (i.e. attached by some
process). In such a case, the segment is only marked for destruction, and is
actually destroyed after all processes detach from it. Here is how to destroy a
segment:
/* this structure is used by the shmctl() system call. */
struct shmid_ds shm_desc;
/* destroy the shared memory segment. */
if (shmctl(shm_id, IPC_RMID, &shm_desc) == -1) {
    perror("main: shmctl");
}
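Note that any process may destroy the shared memory segment, not only the one
that created it, as long as its effective user ID matches the owner or the
creator of the segment, or the process is privileged. Write permission to the
segment alone is not enough.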
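To see how many processes are still attached to a segment, we can ask shmctl() for the segment's descriptor with the IPC_STAT command and look at its 'shm_nattch' field. Here is a minimal sketch; the function name 'shm_attach_count' is made up for this illustration.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* hypothetical helper: number of current attachments to the segment, */
/* or -1 on error. */
long shm_attach_count(int shm_id)
{
    struct shmid_ds shm_desc;

    /* IPC_STAT fills 'shm_desc' with the segment's current status. */
    if (shmctl(shm_id, IPC_STAT, &shm_desc) == -1) {
        perror("shmctl");
        return -1;
    }
    return (long) shm_desc.shm_nattch;
}
```

A cleanup routine could use this count to report whether the memory will be released right away or only once the remaining processes detach.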