From list_ob at gmx.net Fri Dec 3 14:28:55 2010
From: list_ob at gmx.net (Oliver Betz)
Date: Fri, 03 Dec 2010 15:28:55 +0100
Subject: Control deferred writes?
Message-ID: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>

Hello All,

can I tell gcc not to defer writes, possibly only to certain variables?

In the ColdFire V2 microcontrollers, consecutive register writes are slow. Consecutive writes to "off-platform" registers take 12 cycles, IIRC. This is less of a problem if the register writes are not consecutive, in other words if other instructions are executed between the writes.

Therefore it is detrimental to delay writes to these registers if doing so leads to consecutive writes.

After all, it's extremely hard to debug code where everything is out of order. And no, I sometimes can't switch off optimization for debugging.

Oliver
--
Oliver Betz, Muenchen

From list-bastian.schick at sciopta.com Fri Dec 3 14:38:26 2010
From: list-bastian.schick at sciopta.com (42Bastian)
Date: Fri, 03 Dec 2010 15:38:26 +0100
Subject: [coldfire-gnu-discuss] Control deferred writes?
In-Reply-To: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>
References: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>
Message-ID: <4CF900E2.90804@sciopta.com>

On 03.12.2010 15:28, Oliver Betz wrote:
> Hello All,
>
> can I tell gcc not to defer writes, possibly only to certain
> variables?

No, not at all. If you need such, write assembly.

The compiler has no idea of the underlying hardware. It might schedule instructions if it knows the CPU core, but not w.r.t. bus timing.

--
42Bastian
+
| http://www.sciopta.com
| Fastest direct message passing kernel.
| IEC61508 certified.
+

From david at westcontrol.com Fri Dec 3 15:34:48 2010
From: david at westcontrol.com (David Brown)
Date: Fri, 03 Dec 2010 16:34:48 +0100
Subject: [coldfire-gnu-discuss] Control deferred writes?
In-Reply-To: <4CF900E2.90804@sciopta.com>
References: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net> <4CF900E2.90804@sciopta.com>
Message-ID: <4CF90E18.7040107@westcontrol.com>

On 03/12/2010 15:38, 42Bastian wrote:
> On 03.12.2010 15:28, Oliver Betz wrote:
>> Hello All,
>>
>> can I tell gcc not to defer writes, possibly only to certain
>> variables?
>
> No, not at all. If you need such, write assembly.
>
> The compiler has no idea of the underlying hardware.
> It might schedule instructions if it knows the CPU core, but not w.r.t.
> bus timing.
>

Writes from the Coldfire V2 core are in-order: there is no re-ordering write buffer, and the data cache is write-through. Other Coldfire cores may have hardware that affects the ordering or buffering of writes.

The compiler does not know the timing of writes to various places. Thus when scheduling it can only assume that all writes take a fixed number of cycles.

Since you don't have to use any cpu-specific instructions to enforce or control the ordering of the writes, the only issue is to control the compiler-generated write instructions.

There are three tools for doing that. One is to write at least some parts of your code in assembly, as suggested.

Use of "volatile" is important. All "volatile" writes will be generated in the order expected by the program, and you will get no more nor less than you ask for. But note that non-volatile reads and writes can be re-ordered freely around the volatile reads and writes. Remember also that it is possible to enforce volatile writes to non-volatile data with a bit of (slightly messy) casting:

    *(volatile int32_t *)(&foo) = 123;

The final tool is the memory block (a compiler-level memory barrier, expressed as a "memory" clobber), usually written as:

    asm volatile ("" ::: "memory");

This tells the compiler that any writes to memory need to be completed before "executing" the line (it generates no code by itself), and that no data read before the line can be trusted after the line (i.e., any data held in registers must be re-read).
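The two compiler-level tools described above can be sketched as follows. This is only an illustration, assuming GCC (the inline asm syntax is a GCC extension). The names fake_reg_a, fake_reg_b, bookkeeping, write_volatile32, compiler_barrier and spaced_writes are made up for the example; real code would use the actual register addresses from the device header.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two peripheral registers. On real hardware
   these would be fixed addresses, not ordinary variables. */
static int32_t fake_reg_a;
static int32_t fake_reg_b;

/* Ordinary, non-volatile data updated between the register writes. */
static int32_t bookkeeping;

/* Force a volatile store to an object that is not declared volatile. */
static inline void write_volatile32(int32_t *p, int32_t value)
{
    *(volatile int32_t *)p = value;
}

/* Compiler-only barrier: emits no instruction, but the compiler may not
   move memory accesses across it. It does not constrain arithmetic that
   lives purely in registers. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" ::: "memory");
}

/* Keep a non-volatile store between the two register writes. */
int32_t spaced_writes(int32_t x)
{
    write_volatile32(&fake_reg_a, 1);
    compiler_barrier();
    bookkeeping = x * 3 + 7;    /* cannot be moved past either barrier */
    compiler_barrier();
    write_volatile32(&fake_reg_b, 2);

    return bookkeeping + fake_reg_a + fake_reg_b;
}
```

Whether this actually keeps the two bus writes apart on a given ColdFire part still depends on what the compiler schedules between them, so it is worth checking the generated assembly.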
Use volatile accesses, and memory blocks if needed, to enforce the write ordering that you require. Then let the compiler handle the rest as best it can.

From list_ob at gmx.net Fri Dec 3 16:05:48 2010
From: list_ob at gmx.net (Oliver Betz)
Date: Fri, 03 Dec 2010 17:05:48 +0100
Subject: [coldfire-gnu-discuss] Control deferred writes?
In-Reply-To: <4CF900E2.90804@sciopta.com>
References: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>, <4CF900E2.90804@sciopta.com>
Message-ID: <4CF9155C.26577.5C258720@list_ob.gmx.net>

42Bastian wrote:

(I sent a direct reply by accident, now to the list as intended)

> > can I tell gcc not to defer writes, possibly only to certain
> > variables?
>
> No, not at all. If you need such, write assembly.
>
> The compiler has no idea of the underlying hardware.
> It might schedule instructions if it knows the CPU core, but not w.r.t.
> bus timing.

I wasn't asking for such optimizations, but I find many deferred writes where I can't see any benefit.

If the variable will be written at some later time *) anyway, why does the compiler delay this write at all? In other words, what is the intended benefit? It doesn't require less code, and it doesn't save execution time as far as I can see.

Oliver

*) e.g. "volatile". gcc of course respects "6.7.3 Type qualifiers" (including footnote 114) of ISO/IEC 9899:1999.

--
Oliver Betz, Muenchen

From list_ob at gmx.net Sat Dec 4 12:44:07 2010
From: list_ob at gmx.net (Oliver Betz)
Date: Sat, 04 Dec 2010 13:44:07 +0100
Subject: [coldfire-gnu-discuss] Control deferred writes?
In-Reply-To: <4CF90E18.7040107@westcontrol.com>
References: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>, <4CF900E2.90804@sciopta.com>, <4CF90E18.7040107@westcontrol.com>
Message-ID: <4CFA3797.23328.60934038@list_ob.gmx.net>

David Brown wrote:

[...]
> control the ordering of the writes, the only issue is to control the
> compiler-generated write instructions.
>
> There are three tools for doing that.
> One is to write at least some parts of your code in assembly, as suggested.

I already squeeze out every cycle by using assembly code where necessary, but I would prefer the compiler to yield good results.

> Use of "volatile" is important. All "volatile" writes will be generated
> in the order expected by the program, and you will get no more nor less
> than you ask for. But note that non-volatile reads and writes can be

The "problem" is the concentration of deferred writes to volatile objects (e.g. at sequence points).

Volatile writes shall not be optimized out or reordered, so if there is a write to a volatile, why defer it at all?

[...]

> Use volatile accesses, and memory blocks if needed, to enforce the write
> ordering that you require. Then let the compiler handle the rest as
> best it can.

Maybe gcc could do better.

Oliver

--
Oliver Betz, Muenchen

From art_bryan at hotmail.com Sun Dec 5 21:54:59 2010
From: art_bryan at hotmail.com (Arthur Bryan)
Date: Sun, 5 Dec 2010 16:54:59 -0500
Subject: Linker Tutorial for embedded systems
In-Reply-To:
References:
Message-ID:

Linker Tutorial for embedded systems

The purpose of this tutorial is to show how to read a linker script file and the GNU linker document. It does not go into every aspect of the linker (m68k-elf-ld) but provides enough information to understand the CodeSourcery linker script file and to use it in embedded bare-metal systems. Understanding the linker script file lets you ensure that only the code you want ends up in the final executable, and lets you trace where each part of that executable came from. Knowing what is in the final code helps you keep it as small as your embedded system requires. It can also provide insight into how best to use static RAM and cache in your system.

First some background. When you compile your program into an executable file, the compiler creates one or many position independent object files, where this PIC code is placed in output sections.
Position Independent Code (PIC), as the term is used in this tutorial, means that final address locations have not been resolved. (Strictly speaking, such object code is called relocatable; PIC usually refers to code built to run at any address, for example with -fPIC.) Why, you ask? At final linking, which is what we are addressing in this tutorial, you can choose to put your code at any location within memory, so a function you wrote could end up at location 0x10 or 0x30. You can decide where using the linker script, or let the linker do it automatically. These output sections become the basis for the input sections within the linker script.

There is also additional code that is added by way of the linker script file. This code too has output sections. It is used to configure the controller and then set up the environment to execute your program from the main function. It comes from archive files such as libc.a, libcs3.a, libcs3coldfire.a, libcs3hosted.a, etc. I refer to these as archive files because each file listed is a combination of several object files. To see what is in one of these files, use the following command.

"C:\CodeSourcery\Sourcery G++ Lite\bin\m68k-elf-objdump.exe" -h -l -p -s -D "c:\CodeSourcery\Sourcery G++ Lite\m68k-elf\lib\libcs3.a" > c:\temp\objdump.txt

This file contains a number of object files. Below are just a few:

start-sim.o
start_c.o
heap.o
demoem-reset.o

The linker takes your compiled source code and the above archive files, puts them at certain physical locations in the resulting executable file, and resolves function call addresses. The linker does numerous other things, but these are its main purposes.

There are a number of things to understand about the GNU linker document. The first and most important is the meaning of the "input sections", "output sections" and "SECTIONS" phrases. In particular, the input sections actually come from the output sections of your source object code, of which libcs3.a is a part. Put another way, when you compile and link your program, your compiler first creates one or more object files, usually denoted with a ".o" extension.
In these object files your code is broken down into sections whose names are arbitrary, but in most cases they are .text, .data, .bss, etc. Knowing these section names, you can then map them to the output sections of your final executable. The period before .text is just part of the name and bears no other significance.

Now the resulting executable file is in some sort of format. It may be ELF, S19, PE or another format. We will only be considering ELF in this tutorial. The SECTIONS keyword defines the layout of the final executable. Below is the syntax for the SECTIONS phrase. Think of the contents of the linker file as a skeleton layout, similar to laying out a book. Think of SECTIONS as the body of the document.

SECTIONS
{
} /*End of sections*/

There is normally only one SECTIONS phrase in a linker file, and within the curly brackets we define the output sections that go into the executable.

SECTIONS
{
  ".text" :
  {
  }
  .data :
  {
  }
  .bss.eh_frame :
  {
  }
  .eh_frame_hdr :
  {
  }
  .rodata :
  {
  }
  .cs3.rambar :
  {
  }
} /*End of sections*/

Within the SECTIONS phrase you have ".text", .data, .bss.eh_frame, .eh_frame_hdr, .rodata and .cs3.rambar. These are just some of the "output sections" being defined for the resulting executable you are creating. You will notice that ".text" has quotes and the others do not. I added the quotes to demonstrate that it is just a name and that the period before it has no special meaning. The other output section names could also be quoted. Note that these names are arbitrary: you could call them "dog" and "cat". The reason you will see so many examples with .text and .data is that they are a holdover from the a.out binary format.

Now each output section can have one or more input sections from the source object files you have compiled and from the added archive files. Remember that when you compiled your main.c file, the linker also added some archive files at link time to get your application to work on your microcontroller.
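To see from the compiler side that section names really are just names, here is a small sketch in C. It assumes GCC with an ELF target; the section(...) attribute is a GCC extension, and the names dog, cat, .text.pets, dog_value, cat_value and pet_sum are invented for this example.

```c
#include <stdint.h>

/* Place two variables in sections with arbitrary names. */
__attribute__((section("dog"))) int32_t dog_value = 10;
__attribute__((section("cat"))) int32_t cat_value = 20;

/* A function can be given its own section name as well. */
__attribute__((section(".text.pets"))) int32_t pet_sum(void)
{
    return dog_value + cat_value;
}
```

After compiling with -c, running objdump -h on the object file should list dog, cat and .text.pets next to the usual sections. A linker script can then refer to them as input sections, for example *(dog) or *(cat), in the same way it refers to *(.text).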
Think of the input sections as the paragraphs of each chapter, where the chapters are the output sections, e.g. ".text" and .data. You take paragraphs of information from your research, which in this case is your source object files and the added archive files. The "*(.text)" below is an input section that states "if any of my source object files and archive files has a .text section, place the code within this section here." The asterisk (*) means any file.

".text" :
{
  *(.text)                /*line 1*/
  *(.text1 .rdata)        /*line 2*/
  *(.text2) *(.rdata)     /*line 3*/
  Foo.o (.text3)          /*line 4*/
}

Line 2 above says to put the .text1 and .rdata input sections here, intermingled in the order in which they are found in the input files. Line 3 says to put all .text2 input sections here first, followed by all .rdata input sections. Line 3 could equally have put the .rdata input section on the next line down. Finally, line 4 says to put the .text3 input section from the Foo.o file here. Because the lines are processed in order, code is added sequentially.

Let's turn to where we want to put our code in memory. There are several methods of placing our ".text" output section and the other output sections in memory, but I will only demonstrate one. First we define the memory regions in our microcontroller. This is done with the phrase below.

MEMORY
{
  rom (rx) : ORIGIN = 0x00000000, LENGTH = 2M
  ram (rwx) : ORIGIN = 0x40000000, LENGTH = 16M
  rambar (rwx) : ORIGIN = 0x80000000, LENGTH = 16K
}

We have given each memory range a name: rom, ram and rambar. Now we tell the linker to put ".text" in a particular region with the ">rom" phrase.

".text" :
{
  /*input sections here*/
} >rom

This places the ".text" code in rom, starting at 0x00000000. Note that there is normally only one MEMORY phrase, and it comes before the SECTIONS phrase.

I have also seen the syntax ">rambar AT>rom" in the CodeSourcery linker files but am unsure how it works. The ld manual describes ">region" as setting the region for the section's run-time address (VMA) and "AT>region" as setting the region for its load address (LMA), which suggests the section is stored in rom and copied to rambar at startup. More research is needed.

".text" :
{
  /*input sections here*/
} >rambar AT>rom

Above I have shown that I can place my input sections sequentially in my output section. However, each input section may not start at the most suitable location, so I can adjust the placement by doing any of the following:

. = ALIGN (8);
_bdata = (. + 3) & ~ 3;
variable = ALIGN(0x8000);

I suggest using only the first statement, since it advances the location counter to an address where the next input section listed after it will start.

We now turn to the ".", which is called the location counter. Think of it as a program counter whose address increments as input sections are added sequentially to each output section. You may see the location counter manipulated as above, either to change its address or to read its address. The ld.pdf demonstrates why you would want its address. Note: do not confuse this dot "." with the dots in names such as .text above. That is just a coincidence. The dot has to stand alone, with white space before and after it.

Let's turn to reading a linker script from CodeSourcery. I will annotate each phrase just below it.

OUTPUT_ARCH(m68k)
/*This tells the linker what kind of controller it is creating the final executable for*/

ENTRY(__cs3_reset)
/*This identifies the first executed location. An object dump of a simple application reveals that the second entry of the image (the initial program counter slot of the ColdFire vector table) holds the address of this location. I am using the m5208evb-rom.ld linker file and building against the ColdFire 5208*/

SEARCH_DIR(.)
/*This tells the linker to add the current directory to the paths it searches for the libraries below*/

GROUP(-lgcc -lc -lcs3 -lcs3unhosted -lcs3coldfire)
/*This states that the listed archive files are to be included in the link. Archives in a GROUP are searched repeatedly until no new undefined references are created; any symbol still unresolved afterwards results in an error. -lgcc is libgcc.a and -lc is libc.a. Look in location "C:\CodeSourcery\Sourcery G++ Lite\m68k-elf\lib"
for these files*/

MEMORY
{
  rom (rx) : ORIGIN = 0x00000000, LENGTH = 2M
  ram (rwx) : ORIGIN = 0x40000000, LENGTH = 16M
  rambar (rwx) : ORIGIN = 0x80000000, LENGTH = 16K
}
/*We explained the MEMORY phrase before. The attributes in parentheses, such as (rx), tell the linker what can be done with each region: r is read, w is write, x is execute */

EXTERN(__cs3_reset __cs3_reset_m5208evb)
EXTERN(__cs3_start_asm _start)
/* Bring in the interrupt routines & vector */
INCLUDE coldfire-names.inc
EXTERN(__cs3_interrupt_vector_coldfire)
EXTERN(__cs3_start_c main __cs3_stack __cs3_heap_end)

/* Provide fall-back values */
PROVIDE(__cs3_heap_start = _end);
PROVIDE(__cs3_heap_end = __cs3_region_start_ram + __cs3_region_size_ram);
PROVIDE(__cs3_region_num = (__cs3_regions_end - __cs3_regions) / 20);
PROVIDE(__cs3_stack = __cs3_region_start_ram + __cs3_region_size_ram);
/* These force the linker to search for particular symbols from
 * the start of the link process and thus ensure the user's
 * overrides are picked up */

SECTIONS
{
  ".text" :
  {
    CREATE_OBJECT_SYMBOLS
    /*I am currently unaware of its intended purpose here. The ld manual says it creates a symbol for each input file, named after that file, and that this is conventional for the a.out format. I believe it helps when debugging with the map file. More info can be found here. http://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_20.html */

    __cs3_region_start_rom = .;
    _ftext = .;
    /*Think of these as linker variables that record the location where they are listed. Although they are listed sequentially, they have exactly the same address. You can use them as location pointers into your final compiled program. Because the ".text" output section is in rom and rom starts at 0x00000000, both of these variables have an address of 0x00000000 */

    *(.cs3.region-head.rom)
    /*This says to place here all code from my input object files (i.e. my archives and compiled application code) that lies within their .cs3.region-head.rom section, if any. The section name refers to sections within the archive files and my compiled code only, not to the output sections I am now building.*/

    ASSERT (. == __cs3_region_start_rom, ".cs3.region-head.rom not permitted");
    /*ASSERT throws an error, printing the message, when its expression is false. Here the expression ". == __cs3_region_start_rom" is false only if the line above added something, so the link fails if any input file contains a .cs3.region-head.rom section. Notice that __cs3_region_start_rom held the location before *(.cs3.region-head.rom) and is compared against the dot "." after that input section ends. A more readable version could be:

      __start_region_head = .;
      *(.cs3.region-head.rom)
      __end_region_head = .;
      ASSERT (__start_region_head == __end_region_head, ".cs3.region-head.rom not permitted");

    End of example. */

    __cs3_interrupt_vector = __cs3_interrupt_vector_coldfire;
    *(.cs3.interrupt_vector)
    /* Make sure we pulled in an interrupt vector. */
    ASSERT (. != __cs3_interrupt_vector_coldfire, "No interrupt vector");
    /*This also throws an error, in this case when the location counter still equals __cs3_interrupt_vector_coldfire, meaning that no interrupt vector data was placed. */

    PROVIDE(__cs3_reset = __cs3_reset_m5208evb);
    /* The PROVIDE phrase defines a symbol only if it is referenced but not defined by any of the input files. If your program defines it, e.g. (int __cs3_reset = 20;), the program's version is used. If your program only references __cs3_reset (declares it external), the assignment within the brackets "()" takes effect. */

    *(.cs3.reset)
    /*Again, this means place here all the code within the .cs3.reset sections of your application and of the included archive files */

    _start = DEFINED(__cs3_start_asm) ? __cs3_start_asm : _start;
    /*Look familiar? It should: it is the syntax of C's conditional (?:) operator. This, however, is not C code being compiled. It says that if __cs3_start_asm is in the linker's global symbol table (ld.pdf, pg 73), then _start takes its value; otherwise _start keeps its own value */

    *(.text.cs3.init)
    *(.text .text.* .gnu.linkonce.t.*)
    /*We have seen these before. The .text.* uses the wildcard "*" to match any section whose name begins with ".text.". Also, because these names share one set of parentheses, the matching input sections are intermingled rather than added one name after the other, as might be assumed. */

    . = ALIGN(0x4);
    /*We have also seen this before. This tells the location counter to go to the next 4 byte boundary.
We do this to ensure that, if the previous input sections did not end on a 4 byte boundary, the next input section does not start at an unaligned address. Why do this? Some processors move data into cache in 4, 8, 16 and even 64 byte chunks. With misaligned code the processor may have to access memory several times to fetch it, which increases processing time and slows execution. Some processors, I believe, will even stop or fault when asked to process misaligned code. There is a phrase SUBALIGN (ld.pdf, paragraph 3.6.8.4) in the linker document that even allows you to force the alignment of the individual input sections within an output section.*/

    KEEP (*crtbegin.o(.jcr))
    /*If "--gc-sections" is used, the linker throws away sections that it determines are never referenced. The KEEP phrase tells the linker to leave this input section in place*/

    KEEP (*(EXCLUDE_FILE (*crtend.o) .jcr))
    /*The EXCLUDE_FILE (*crtend.o) tells the linker not to include any .jcr section from the crtend.o file */

    KEEP (*crtend.o(.jcr))
    /*Now here it adds the .jcr sections from the crtend.o file */

    . = ALIGN(0x4);
    *(.gcc_except_table .gcc_except_table.*)
  } >rom

I have not covered using linker symbols from your application code. You can think of linker symbols as address references to locations within the final linked application. They can be used to identify and move areas of code and data from one location to another, or to access data placed directly by the linker instead of by your program. Section 3.5.4, Source Code Reference, of the ld.pdf gives an example.

I have also not covered how to replace those predefined libraries with your own.

Summary.

When you create your final executable program it goes through several steps. First your source code is compiled into position independent object code that has output sections. Next, these output sections become the basis for input sections of your linker script.
Your linker script then places these output sections, now acting as input sections, into the output sections of your final executable. Since we are dealing with the ELF format and the same section names are used at both stages, talk about output and input sections sometimes becomes ambiguous. The linker uses the dot "." location counter to identify and resolve addresses within the final executable. What was once PIC code is now position dependent, with addresses that are known because they were defined by the linker script. We can also manipulate the dot location counter to move to various locations when placing our code.

Please do not hesitate to comment on or add to this tutorial.

From tom_usenet at optusnet.com.au Tue Dec 7 04:04:18 2010
From: tom_usenet at optusnet.com.au (Tom Evans)
Date: Tue, 07 Dec 2010 15:04:18 +1100
Subject: [coldfire-gnu-discuss] Control deferred writes?
In-Reply-To: <4CFA3797.23328.60934038@list_ob.gmx.net>
References: <4CF8FEA7.23858.5BCCD3BB@list_ob.gmx.net>, <4CF900E2.90804@sciopta.com>, <4CF90E18.7040107@westcontrol.com> <4CFA3797.23328.60934038@list_ob.gmx.net>
Message-ID: <4CFDB242.3020802@optusnet.com.au>

Oliver Betz wrote:
> Volatile writes shall not be optimized out or reordered,

As long as you don't run into these:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

Tom