class: title, smokescreen, shelf, bottom, no-footer
background-image: url(images/ptr-memory.png)

# 181U Spring 2020
### C Arrays, Strings, and Pointers

---
layout: true

.footer[
- Geoffrey Brown, 2020
- 181U
]

<style>
h1 {
  border-bottom: 8px solid rgb(32,67,143);
  border-radius: 2px;
  width: 90%;
}
.smokescreen h1 { border-bottom: none; }
.small.remark-slide-content.compact {font-size:1.2rem}
.smaller.remark-slide-content.compact {font-size:1.1rem}
.small-code.remark-slide-content.compact code {font-size:1.0rem}
.very-small-code.remark-slide-content.compact code {font-size:0.9rem}
.line-numbers{
  /* Set "line-numbers-counter" to 0 */
  counter-reset: line-numbers-counter;
}
.line-numbers .remark-code-line::before {
  /* Increment "line-numbers-counter" by 1 */
  counter-increment: line-numbers-counter;
  content: counter(line-numbers-counter);
  text-align: right;
  width: 20px;
  border-right: 1px solid #aaa;
  display: inline-block;
  margin-right: 10px;
  padding: 0 5px;
}
</style>

---
class: compact

# Agenda

* Memory Layout
  * Bits, bytes
  * Memory layout of C types
  * Alignment
  * Endianness
* Variables
* Pointers
* Arrays
* Strings

---
class: compact

# Word Bit Numbering

![](images/space.png# w-10pct)
![](images/bitnumbering.png# w-80pct)

---
class: compact

# Hexadecimal Encoding

![](images/space.png# w-33pct)
![](images/hextable.png# w-30pct)

---
class: compact

# Word Size

* A computer's *word size* is commonly the number of bits in a memory address or the size of an integer. There is no standard definition.

* Examples
  - Pentium (IA32) -- 32 bits
  - IA64 -- 64 bits
  - ARM Cortex -- 32 bits
  - msp430 -- 16 bits

---
class: compact

# Word Size Matters

For a word size of \\(n\\) bits:

* The memory address range: \\(0..2^n - 1\\).
* Primitive integer range:
  - signed integers \\(-2^{n-1}..2^{n-1} - 1\\)
  - unsigned integers \\(0..2^n - 1\\)
  - operations on larger sizes require multiple instructions

---
class: compact

# C Data type sizes (in bytes) for Various Architectures

| C Data Type | IA64 | ARM 32bit | Required |
| ----------- | ---- | --------- | -------- |
| char        | 1    | 1         | minimum size to hold a character |
| short       | 2    | 2         | at least 2 bytes |
| int         | 4    | 4         | at least 2 bytes |
| long int    | 8    | 4         | at least 4 bytes |
| long long   | 8    | 8         | at least 8 bytes |
| float       | 4    | 4         | commonly 4 bytes (IEEE) |
| double      | 8    | 8         | commonly 8 bytes (IEEE) |
| void *      | 8    | 4         | machine dependent |

---
class: compact

# Byte Addressable Memory

![](images/space.png# w-10pct)
![](images/bytememory.png# w-60pct)

---
class: compact

# Byte Order -- little endian

![](images/le-order.png# w-90pct)

---
class: compact

# Byte Order -- big endian

![](images/be-order.png# w-90pct)

---
class: compact

# Byte Swapping

![](images/space.png# w-20pct)
![](images/byteswap.png# w-50pct)

---
class: compact

# Memory Alignment

Typical alignment requirements:

* int8_t : any address
* int16_t : any even address
* int32_t, float : any address divisible by four
* int64_t, double : any address divisible by eight

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Memory Alignment -- Structures

![](images/struct-linear.png# w-2-12th fr)

* Fields of a structure are laid out sequentially
* Every field is aligned on its required boundary
* Structure alignment is the same as that of its most restrictive field
* Size of structure is a multiple of the alignment of its most restrictive field

```C
struct example {
  int a;
  short b;
  long long c;
};
```

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Memory Alignment -- Unions

![](images/union-linear.png# w-30pct fr)

* Fields of a union are laid out in the same region
* Every field is aligned on its required boundary
* Union alignment is the same as that of its most restrictive field
* Size of union is the size of its largest field, rounded up to a multiple of the union's alignment

```C
union example {
  int a;
  short b;
  long long c;
};
```

---
class: compact

# C Memory Model (simplified)

![](images/memmodel.png# w-40pct fr)

* Global and static variables in Data *section*
* Machine instructions and constants in Code *section*
* Local variables in Stack *section*
* malloc'd variables in Heap (ignore for now)

What is ignored

* Shared libraries -- this is a subject for an operating systems course
* Relative positions of sections may differ in a "real" system.
* Memory mapping -- subject for an OS course

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Variables

A variable is state with a name -- when the variable is created, space is allocated for it in memory.

```C
int k;
```

This defines a variable named **k** of type **int**. On many machines, this requires 4 bytes of memory.

We can modify the value of **k** in our program:

```C
k = 33;
```

The effect of executing this statement is to modify the memory allocated to **k** by changing its contents to the binary representation of 33.

Thus the *name* of a variable is associated, by the compiler, with an address in memory.
---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Variables

```C
int j, k;

k = 2;  // assign memory of k value 2
j = 7;  // assign memory of j value 7
k = j;  // copy from memory of j to memory of k
```

---
class: compact,small

# Differences between C and Java

A variable is a name that is associated with a block of memory

* In C:
  * Globals are allocated space by the linker in the data section
  * Locals are allocated space at runtime on the stack
* In Java:
  * Allocation is mostly in the heap
  * Allocated objects may be moved by garbage collection
* In C:
  * You can access the underlying memory of a variable
  * You can cast this memory as another type
  * You can "retrieve" the address of the memory allocated to a variable
* In Java:
  * You cannot access the underlying memory
  * You can pass a reference, but that is *opaque* in that its correspondence to the underlying representation of a variable is hidden.

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Variables, Addresses, Pointers

![](images/ptr-memory.png# w-50pct fr)

* `int* ` is the type of a *pointer* to an integer variable
* `&` is the address-of operator: `&i` returns the address of variable `i`

```C
int i = 123;   // create a variable
int* p = &i;   // place the address of i in a new variable
```

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Creating and using pointers

A pointer type is defined by appending a `*` to a regular type:

```C
int* iptr;
float* fptr;
char* cptr;
```

We can capture the address of the memory allocated to a variable with the `&` operator:

```C
int i;
float f;
char c;

iptr = &i;
fptr = &f;
cptr = &c;
```

The address of a variable is the memory address of the first byte of the memory allocated to it.
---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Variables, Addresses, Pointers

`*` is used in several ways

* As part of a type
* To read the value of a variable through a pointer
* To write a value to a variable through a pointer

```C
float f = 3.14159;   // create a variable f
float* fptr = &f;    // get the address of f
float f2 = *fptr;    // copy value of f to f2
*fptr = 4.5;         // change the value of f
```

When `*` is used to read or write the value of a variable through a pointer, we call that *dereferencing* the pointer. The pointer itself is a *reference*.

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Passing Parameters by Reference

```C
int test_and_dec(int* ip) {
  if (*ip > 0) {
    *ip -= 1;
    return 1;
  } else {
    return 0;
  }
}
...
int i = 123;   // a global variable
...
if (test_and_dec(&i)) {
  // i is less than before, possibly 0
} else {
  // i was already <= 0
}
```

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Passing Parameters by Reference

We can implement a "swap" operation with pointers

```C
void swap(int *a, int *b) {
  int temp = *a;
  *a = *b;
  *b = temp;
}
```

---
class: very-small-code,compact,hljs-tomorrow-night-eighties,line-numbers

# Passing Parameters by Reference

C's **scanf** library function requires that its parameters be passed by reference -- it is the "dual" of **printf** in reading from input into variables

```C
int i;
float f;
char name[50];

// %49s limits the read to 49 characters plus the terminating 0
scanf("%d %f %49s", &i, &f, name);
```

Notice that we had to take pointers to **i** and **f** but not **name**. As we shall see, an array name used this way is already converted to a pointer to its first element.

---
class: compact,hljs-tomorrow-night-eighties,line-numbers

# Pointer types and Arrays

A pointer to a scalar variable and an array of that scalar type are accessed in the same way: in most expressions an array name is converted to a pointer to its first element!

```C
int i;
int j[10];
int* ptr = &i;

*ptr = 2;
ptr[0] = 2;      // same as *ptr = 2 !!!
ptr = j;         // j converts to the address of the first element of the array
ptr[0] = 0;
ptr[1] = 1;
*(ptr + 2) = 2;  // same as ptr[2] = 2 !!!
```

In the case of **i** the compiler allocates **sizeof(int)** bytes (properly aligned) and in the case of **j[]** the compiler allocates **10 * sizeof(int)** bytes.

---
class: compact,hljs-tomorrow-night-eighties,line-numbers,small-code

# Pointer types and Arrays

```C
int my_array[] = {1,23,17,4,-5,100};  // create array with 6 integers

int sum = 0;
for (int i = 0; i < 6; i++) {
  sum += my_array[i];
}

// we can do the same thing with pointer arithmetic

int *aptr = my_array;
int sum2 = 0;
for (int i = 0; i < 6; i++) {
  sum2 += *aptr++;
}
```

---
class: compact,hljs-tomorrow-night-eighties,line-numbers,small-code

# Pointers and strings

Strings are null-terminated arrays of characters (char)

```C
const char *s = "a string";   // string literals must not be modified

s[0] == 'a';
s[1] == ' ';
...
s[7] == 'g';
s[8] == 0;
s[8] == '\0';
```

```C
char my_string[20];

my_string[0] = 'o';
my_string[1] = 'k';
my_string[2] = '\0';
```

---
class: compact,hljs-tomorrow-night-eighties,line-numbers,very-small-code,small

# Pointers and strings

The null-termination is important! It's also a major source of bugs in real programs. Consider the following copy function

```C
char *my_strcpy(char *dest, const char *src) {
  char *p = dest;
  while (*src) {
    *p++ = *src++;
  }
  *p = 0;
  return dest;
}
```

So many things could go wrong

* **src** isn't null terminated
* **dest** doesn't reference a sufficiently large block of memory

The net result is that memory that shouldn't be touched is read or overwritten -- if you're lucky the program crashes.
---
class: compact,hljs-tomorrow-night-eighties,line-numbers,small-code

# Pointers and strings

```C
dest[i] = source[i];
```

has the same effect as

```C
*(dest + i) = *(source + i);
```

and

```C
while (*source != '\0')
```

is the same as

```C
while (*source)
```

---
class: compact,hljs-tomorrow-night-eighties,line-numbers,small-code

# Standard string functions

In **string.h**

* **strlen()** : find the length of a string (doesn't count the terminating 0)
* **strcat()** : concatenate strings
* **strcpy()** : copy a string
* **strchr()** : find location of a character in a string
* **strstr()** : locate a substring in a string

There are many more.

---
class: compact,hljs-tomorrow-night-eighties,line-numbers,small-code

# Pointers and Structures

```C
struct tag {
  char lname[20];
  char fname[20];
  int age;
} mytag;

struct tag *st_ptr;

st_ptr = &mytag;
st_ptr->age = 33;   // same as (*st_ptr).age = 33

printf("%s %s %d\n", st_ptr->fname, st_ptr->lname, st_ptr->age);
printf("%s %s %d\n", mytag.fname, mytag.lname, mytag.age);
```

Note: The size of a structure must be a multiple of the alignment of its most restrictive member so that arrays of structures are properly aligned.

Pointers to unions work the same way.

---
class: compact

# Summary

* Memory
* Variables
* Pointers, arrays and strings.

* Cover photo and lecture material from: https://www.cs.rochester.edu/u/ferguson/csc/c/c-for-java-programmers.pdf
* Lecture material also from: https://www.cs.utexas.edu/~ans/classes/cs439/docs/dictaat.pdf