[:NitinGupta:Nitin Gupta]

[[MailTo(nitingupta910 AT gmail DOT com)]]

----
= Manages storage for variable sized data objects =
=== It is designed especially for embedded devices. ===
----
[[BR]]
This page describes the problem it tries to solve and its design details.
[[BR]]
[[BR]]
'''Problem Statement[[BR]]'''
Normally when you allocate arbitrary sized objects using kmalloc()/vmalloc() there is big space wastage due to internal fragmentation. So, if memory is at premium, tight storage is required for these variables sized data items which is what VStore does. This however comes at cost of some speed.
[[BR]][[BR]]
'''How to use it?'''[[BR]]
To store data:
{{{
vstore_write(objectID /* out */, data_to_store, len)
}}}

To restore this data:
{{{
vstore_read(objectID, buffer, len /* out */)
}}}

'''NOTE:''' In vstore_read(), make sure that buffer is big enough to store data assoc. with object 'ObjectID' - the 'len' (out) param will tell how much data was read into the buffer. Thus, to make sure that buffer size is always sufficient, either you ''know'' maximum size for objects stored (more common case) or you will have to track size of each object stored to provide buffer of right size.
[[BR]][[BR]]
== Design Details ==
----
=== Design Goals ===
 * Minimize fragmentation
 * Minimize metadata overhead
 * Fast store and restore
 * Provide dense storage - minimize spreading of chunks associated with any particular data object into multiple pages. This improves locality and hence performance.
 * Portability across different archs
[[BR]]
Each data item stored is split into multiple '''chunks''' (small contiguous physical space) and these chunks can then be spread across multiple physical pages. The Metadata associated with each chunk is stored in the beginning of chunk itself with necessary padding added to maintain alignment constraints.

Chunks can be one of three types:
 * Free chunk
 * Busy chunk with next chunk in same page
 * Busy chunk with next chunk in different page

Following gives metadata layout for each of these types of chunks:

'''Some jargon first:'''
{{{
M     = Metadata - size has to be n * ALIGN_BYTES
M(A)  = address of next chunk relative to page start (PAGE_SHIFT - ALIGN_SHIFT bits)
M(S)  = size of chunk (same as M(A))
F     = flags (2 bits)
	F(1): last chunk
	F(2): next chunk in diff page
FP    = prev chunk's flags
FN    = next chunk's flags
PFN   = PageFrameNo where next chunk exists (BITS_PER_LONG - PAGE_SHIFT bits)
OD(i) = i-th optional metadata field. Optional fields are enclosed on []
}}}


- '''Free chunk'''
{{{
M(A) | F | M(S) | free space
}}}

- '''Busy chunk type-1 (next chunk in same page)'''
{{{
M(A) | F | M(S) | [ M(S) of prev chunk ]
	- OD(1) exists if F(2) is not set
}}}

- '''Busy chunk type-2 (next chunk in different page)'''
{{{
M(A) | F | PFN | [ M(S) of prev chunk ] | [ M(S) of this chunk ]
	- only OD(1) exists if		FP(2) && !F(1)
	- only OD(2) exists if		F(1) && !FP(2)
	- both OD(1) and OD(2) exist if	FP(2) && F(1)
}}}

NOTE:
 * Space is not reserved for optional fields that do not exist. Flags let us know which fields really exist.
 * Each type of chunk has required padding at end of metadata to satisfy alignment constraint. Metadata size at beginning of each chunk is always some multiple of ALIGN_BYTES. So, access to actual data is always aligned.


Now, we can represent these three chunk types as:
{{{
#define ALIGN_SHIFT    2    /* power of 2 (min 2)
#define MA_SIZE        PAGE_SHIFT - ALIGN_SHIFT
#define MS_SIZE        MA_SIZE
}}}

- '''Representing Free chunk'''
{{{
struct free_chunk {
	unsigned long MA: MA_SIZE;
	unsigned long flags: 2;
	unsigned long MS: MS_SIZE;
} __attribute__ ((packed));
}}}

- '''Representing Busy chunk type-1 (next chunk in same page)'''
{{{
struct busy_type1_chunk {
	unsigned long MA: MA_SIZE;
	unsigned long flags: 2;
	unsigned long MS1: MS_SIZE;
	unsigned long MS2: MS_SIZE;
} __attribute__ ((packed));
}}}

- '''Representing Busy chunk type-2 (next chunk in different page)'''
{{{
struct busy_type2_chunk {
	unsigned long MA: MA_SIZE;
	unsigned long flags: 2;
	unsigned long PFN: BITS_PER_LONG - PAGE_SHIFT;
	unsigned long MS1: MS_SIZE;
	unsigned long MS2: MS_SIZE;
} __attribute__ ((packed));
}}}


'''Chunk management'''
 * A Chunk never crosses page boundary.
 * Adjacent free chunks are always merged.
 * Free lists are per page.
 * Pages are then placed in free lists according to total free space they have. Pages with maximum free space are selected first for allocation. This gives a dense storage (i.e. we avoid chunks belonging to same data item being spread across many pages).