sizeof
In the programming languages C and C++, the unary operator 'sizeof' is used to calculate the sizes of datatypes. sizeof
can be applied to all datatypes, be they primitive types such as the integer and floating-point types defined in the language, pointers to memory addresses, or the compound datatypes (unions, structs, or C++ classes) defined by the programmer. sizeof
is an operator that yields the size in bytes of the variable, expression, or parenthesized type name that it precedes. Before C99 it was always evaluated at compile time; since C99, applying it to a variable-length array requires evaluation at run time.
Need for sizeof
In many programs, there are situations where it is useful to know the size of a particular datatype (one of the most common examples is dynamic memory allocation using the library function malloc
). Although the size of a given datatype is constant within any one implementation of C or C++, the sizes of even primitive types are implementation-defined (that is, not precisely fixed by the standard). This can cause problems when trying to allocate a block of memory of the appropriate size. For example, say a programmer wants to allocate a block of memory big enough to hold ten variables of type int
. Because our hypothetical programmer doesn't know the exact size of type int
, the programmer doesn't know how many bytes to ask malloc
for. Therefore, it is necessary to use the operator sizeof
:
int *pointer; /*pointer to type int, used to reference our allocated data*/
pointer = malloc(sizeof(int) * 10);
In the preceding code, the programmer instructs malloc
to allocate and return a pointer to memory. The size of the block allocated is equal to the number of bytes a single object of type int
takes up, multiplied by 10, ensuring enough space for all 10 int
s.
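A common variant of this idiom applies sizeof to the dereferenced pointer rather than to the type name, so the allocation stays correct if the pointer's type is later changed. The following is a minimal sketch of that variant; the helper name alloc_ints is hypothetical and used only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for `count` objects of type int.
 * Returns NULL if the byte count would overflow or allocation fails. */
int *alloc_ints(size_t count)
{
    int *pointer;

    /* Guard against count * sizeof *pointer wrapping around. */
    if (count > SIZE_MAX / sizeof *pointer)
        return NULL;

    /* sizeof *pointer is the size of the pointed-to type (int here).
     * The operand is not evaluated, so the still-uninitialized
     * pointer is never actually dereferenced. */
    pointer = malloc(count * sizeof *pointer);
    return pointer;
}
```

A caller would write something like int *p = alloc_ints(10); and release the block with free(p) when finished.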
It is generally not safe to assume the size of any datatype. For example, even though most implementations of C and C++ on 32-bit systems define type int
to be 4 bytes, many programmers recommend always using sizeof
, as the size of an int
could change when code is ported to a different system, breaking the code. In addition, it is frequently very difficult to predict the sizes of compound datatypes such as a struct
or union
due to structure "padding" (see "Structure padding" under "Implementation" below). Another reason for using sizeof is readability, since it avoids magic numbers.
Use
The sizeof operator is used to determine the amount of memory that an object or datatype occupies. To use sizeof
, the keyword "sizeof
" is followed by a type name, variable, or expression. If a type name is used, it always needs to be enclosed in parentheses, whereas variable names and expressions can be specified with or without parentheses. A sizeof
expression evaluates to an unsigned value equal to the size in bytes of the "argument" datatype, variable, or expression (with datatypes, sizeof
evaluates to the size of the datatype; for variables and expressions it evaluates to the size of the type of the variable or expression). For example, assuming int
s are 4 bytes long, the following code will print 1,4:
/* the following code illustrates the use of sizeof
* with variables and expressions (no parentheses needed),
* and with type names (parentheses needed)
*/
char c;
printf("%zu,%zu", sizeof c, sizeof(int));
The value of a sizeof
expression is always non-negative as the C standard specifies that the type of such an expression is size_t
, defined to be an unsigned integer type. The z
length modifier (as in %zu, introduced in C99) should be used to print it, because the width of size_t can differ between implementations.
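Because sizeof reports the size of an expression's type, the result reflects the usual integer promotions, and the operand itself is not evaluated (outside the C99 variable-length array case). The sketch below illustrates both points; all function and variable names are hypothetical.

```c
#include <stddef.h>

int call_count = 0;

/* Hypothetical function with a visible side effect. */
int next_id(void)
{
    return ++call_count;
}

/* In c + 1, c is promoted to int before the addition, so the
 * expression has type int and this returns sizeof(int). */
size_t size_of_promoted_char(void)
{
    char c = 'a';
    return sizeof(c + 1);
}

/* sizeof only inspects the type of next_id(); the call is never
 * made, so call_count is left unchanged. */
int calls_after_sizeof(void)
{
    size_t n = sizeof(next_id());
    (void)n;
    return call_count;
}
```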
Using sizeof
with arrays
When sizeof
is applied to an array, the result is the size in bytes of the array in memory. The following program uses sizeof
to determine the size of an array, avoiding a buffer overflow when copying characters:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
    char buffer[10]; /* Array of 10 chars */
    if (argc < 2)
        return 1; /* No argument to copy */
    /* Copy at most 9 characters from argv[1] into buffer.
     * sizeof(char) is defined to be 1, so the number of
     * elements in buffer is equal to its size in bytes.
     */
    strncpy(buffer, argv[1], sizeof(buffer) - 1);
    /* Set the last element of the buffer to the null character */
    buffer[sizeof(buffer) - 1] = '\0';
    return 0;
}
Here, sizeof buffer
is equivalent to 10*sizeof(char)
, or 10.
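When the element type is wider than char, the element count is commonly obtained by dividing the array's size by the size of one element. This only works where the array itself is visible: an array passed to a function is converted to a pointer, and sizeof then yields the size of the pointer. A minimal sketch, with the macro name ARRAY_LEN and the function names chosen here purely for illustration:

```c
#include <stddef.h>

/* Illustrative macro: number of elements in a true array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* values is an array in this scope, so sizeof sees all ten ints. */
size_t count_in_scope(void)
{
    int values[10] = {0};
    return ARRAY_LEN(values);
}

/* Here values is only a pointer; the caller's array length is not
 * recoverable, and sizeof gives the size of an int pointer. */
size_t size_seen_by_callee(int *values)
{
    return sizeof values;
}
```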
C99 adds support for flexible array members in structures. This form of array declaration is allowed only as the last member of a structure, and differs from normal arrays in that no length is specified to the compiler:
#include <stdio.h>
struct flexarray
{
    char val;
    char array[]; /* Flexible array member; must be last element of struct */
};
int main(void)
{
    printf("sizeof(struct flexarray) = %zu\n", sizeof(struct flexarray));
    return 0;
}
In this case the sizeof
operator yields the size of the structure, including any padding, but without any storage for the array. For the example above, the output is:
sizeof(struct flexarray) = 1
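For structures containing flexible array members, sizeof
is therefore often equal to
offsetof(s, array)
where s
is the structure name and array
is the flexible array member. The two can differ, however, because the structure may end with padding that makes sizeof larger than the offset of the array.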
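Because sizeof excludes the flexible array member, storage for the array's elements has to be added explicitly when such a structure is allocated. A minimal sketch, reusing struct flexarray from the example above; the helper name flexarray_new is an assumption made for illustration, and overflow checking of the size computation is left out for brevity.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct flexarray
{
    char val;
    char array[]; /* Flexible array member */
};

/* Hypothetical helper: allocates a struct flexarray with room for
 * n elements in its flexible array member, zero-initialized. */
struct flexarray *flexarray_new(size_t n)
{
    /* sizeof *p covers the fixed part only; the array's storage
     * is requested on top of it. */
    struct flexarray *p = malloc(sizeof *p + n * sizeof p->array[0]);
    if (p != NULL) {
        p->val = 0;
        memset(p->array, 0, n * sizeof p->array[0]);
    }
    return p;
}
```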
C99 also allows variable-length arrays, whose length is specified at run time[1]. In such cases the sizeof operator is evaluated in part at run time to determine the storage occupied by the array.
#include <stddef.h>
size_t flexsize(int n)
{
    char b[n+3]; /* Variable length array */
    return sizeof b; /* Execution time sizeof */
}
int main(void)
{
    size_t size;
    size = flexsize(10); /* flexsize returns 13 */
    return 0;
}
sizeof
and incomplete types
sizeof
can only be applied to "completely" defined types. With arrays, this means that the dimensions of the array must be present in its declaration, and that the type of the elements must be completely defined. For struct
s and union
s, this means that there must be a member list of completely defined types. For example, consider the following two source files:
/* file1.c */
int arr[10];
struct x {int one; int two;};
/* more code */
/* file2.c */
extern int arr[];
struct x;
/* more code */
Both files are perfectly legal C, and code in file1.c can apply sizeof
to arr
and struct x
. However, it is illegal for code in file2.c to do this, because the declarations in file2.c are incomplete. In the case of arr
, the declaration does not specify the dimension of the array; without this information, the compiler has no way of knowing how many elements are in the array, and cannot calculate the array's overall size. Likewise, the compiler cannot calculate the size of struct x
because it does not know what members the structure is made up of. If the programmer provided the size of the array in its declaration in file2.c, or completed the definition of struct x
by supplying a member list, sizeof
could then be applied to arr
and struct x
in that source file.
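A pointer to an incomplete type is itself a complete type, which is why a file such as file2.c can still declare and pass around a struct x * even though it cannot take sizeof(struct x). A short sketch of the distinction; the function name is hypothetical.

```c
#include <stddef.h>

struct x; /* Incomplete: no member list is visible here */

size_t size_of_pointer_to_x(void)
{
    struct x *p = NULL;

    /* sizeof(struct x) or sizeof *p would be rejected by the
     * compiler at this point, because struct x is incomplete. */

    /* The pointer has a known size regardless. */
    return sizeof p;
}
```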
Implementation
It is the responsibility of the compiler's author to implement the sizeof
operator in a way specific and correct for a given implementation of the language. The sizeof
operator must take into account how the implementation represents and aligns objects in memory to obtain the sizes of various datatypes. sizeof
is usually a compile-time operator, which means that during compilation, sizeof
and its operand are replaced by the resulting value. This is evident in the assembly language code produced by a C or C++ compiler. For this reason, sizeof
qualifies as an operator, even though its use sometimes looks like a function call. Applying sizeof
to variable-length arrays, introduced in C99, is an exception to this rule.
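One consequence of compile-time evaluation is that a sizeof expression (outside the variable-length array case) is an integer constant expression, so it can appear where the language requires a constant, such as an enumeration constant or the dimension of a file-scope array. A brief sketch; all identifiers are illustrative.

```c
#include <stddef.h>

/* An enumeration constant must be a constant expression. */
enum { INT_BYTES = sizeof(int) };

/* So must the dimension of an array declared at file scope. */
unsigned char int_storage[sizeof(int) * 4];

/* A pre-C11 compile-time check: if the condition were false the
 * array size would be -1 and the program would not compile.
 * sizeof(char) == 1 is guaranteed by the standard. */
typedef char sizeof_char_is_one[(sizeof(char) == 1) ? 1 : -1];

size_t storage_bytes(void)
{
    return sizeof int_storage;
}
```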
Structure padding
Main article: data structure alignment
To calculate the sizes of user-defined types, the compiler takes into account any alignment space needed for complex user-defined data structures. This is why the size of a structure in C can be greater than the sum of the sizes of its members. For example, on many systems, the following code will print 8:
#include <stdio.h>
struct student {
    char grade; /* char is 1 byte long */
    int age;    /* int is 4 bytes long on many systems */
};
int main(void)
{
    printf("%zu\n", sizeof(struct student));
    return 0;
}
The reason for this is that most compilers, by default, align complex data-structures to a word alignment boundary. In addition, the individual members are also aligned to their respective alignment boundaries. By this logic, the structure student
gets aligned on a word boundary and the variable age
within the structure is aligned with the next word address. This is accomplished by the compiler inserting "padding" space between members or at the end of the structure to satisfy alignment requirements. Here, padding is inserted to align age
with a word boundary. (Most processors can fetch an aligned word faster than they can fetch a word value that straddles multiple words in memory, and some don't support the operation at all[2]).
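The padding can be observed with the standard offsetof macro from <stddef.h>. The sketch below reuses struct student; the exact numbers depend on the implementation, so the comments only describe what many systems with a 4-byte, 4-byte-aligned int produce. The helper names are hypothetical.

```c
#include <stddef.h>

struct student
{
    char grade;
    int age;
};

/* Bytes of padding between grade and age
 * (3 on many systems with a 4-byte-aligned int). */
size_t padding_before_age(void)
{
    return offsetof(struct student, age)
         - (offsetof(struct student, grade) + sizeof(char));
}

/* Bytes in the structure that are not occupied by any member. */
size_t total_padding(void)
{
    return sizeof(struct student) - (sizeof(char) + sizeof(int));
}
```

Reordering members so that those with the strictest alignment come first is a common way to reduce such padding, although the result remains implementation-dependent.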
References
- [1] "WG14/N1124 Committee Draft, May 6, 2005, ISO/IEC 9899", section 6.5.3.4, The sizeof operator.
- [2] Rentzsch, Jonathan. "Data alignment: Straighten up and fly right." www.ibm.com. 8 Feb 2005. Accessed 1 Oct 2006.