Setting up the toolchain for developing with embedded devices
The first thing you need to do, after perhaps understanding the basics of microcontrollers, is to set up a toolchain. The toolchain is the set of programs that help you get the code you have written onto the chip that is supposed to execute it.
The Difference from local development
If we think about how writing programs on your computer works, it seems fairly simple: write the code, then call the compiler, which examines it and compiles it if no errors are found. The resulting program then sits on disk somewhere. When the compiled program is executed, it is loaded into main memory by the operating system and linked at runtime with so-called dynamically loadable libraries (this is done by the system linker) to build a running program. The program the compiler produces must therefore contain meta information about which libraries are needed to tie up the loose ends inside the program (as they link to the libraries loaded at runtime). If the system linker can't find them, the program "crashes" (actually the linker emits an error and stops the program from ever starting to execute).
When we look at a microcontroller, there is no such thing as an operating system or something like a file (in the abstract sense). Neither is there such a thing as a system linker (which relies on the operating system and system compiler in turn). So all the fancy magic that happens when an application is compiled natively is no longer at our disposal. We need to tell the different tools that work on the code we have written what environment the code will execute in, and set up a way to get the result onto the chip. Also, all code has to be statically linked, as there is nothing on the microcontroller other than the code we are executing.
The Interface
As we are going to write the code for our microcontroller on a "host" machine, and will most likely want to compile it there too, we need a mechanism to get the compiled binary from the host machine to the target machine. The microcontroller, commonly called the "target" in this context, does not speak USB without the proper firmware (if it has a USB interface at all), so we need a different means to get the code onto the device.
This is where the debug probe comes into play. The debug probe is a piece of hardware that can, on the one side, talk to the "host" computer using various protocols on top of USB and, on the other side, talk to the microcontroller via JTAG or SWD connections.
JTAG and SWD are essentially similar to USB in purpose (the purpose of all of these bus systems is to link two or more computers with each other), with the difference that JTAG and SWD are optimized for the specific task of microcontroller development. Most microcontrollers have special hardware to interface with these protocols without software support, and these buses are the primary way that executable code is loaded onto the microcontroller.
JTAG actually originated as a protocol (and associated hardware) that was (and is) used to check the electrical characteristics of the pins of a chip mounted on a PCB. It is used to verify that the hardware, with all the chips and peripherals attached, is in fact in working order and that there is no malfunction in one of the ICs. This is pretty cool because you then don't need any test points or pins on the production boards, reducing the cost of producing one unit. JTAG gained features for debugging a chip over time, and that is all I have ever used it for.
I'm not as knowledgeable about SWD, but it is pretty common on STM32 controllers and many debug probes support it. It is an ARM specification (Serial Wire Debug, a two-pin alternative to JTAG), which is why it is so widespread in ARM Cortex-M development.
Some debug probes (the good ones) will also do the work of powering the target and detecting various things about it. They will also come with some protection against short circuits and other development mishaps. Debug probes are normally not terribly cheap (except the ones from AliExpress, of course) but should last fairly long.
Every debug probe comes with some supporting software, as there still needs to be a piece of software on the host to actually talk to the debug probe and interpret what it sends back over the USB connection. This software really depends on the probe, but much of it has aggregated into the OpenOCD project, which has drivers for all kinds of different JTAG and SWD endpoints.
Different hardware
Black Magic Probe
For my primary work I'll be using the Black Magic Probe (BMP) as debugger. It is small, open source, implements the GDB server on the probe itself without the need for host-side software, and natively exposes a serial port on Linux that can be used to talk to the chip. It is, however, not the cheapest hardware out there (if you want to buy the "original" hardware and not some rip-off; in that case you will be supporting the project, and that is always nice). The use of the BMP is documented here.
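Since the BMP itself is the GDB server, a typical session is just GDB talking to a serial device. A minimal sketch, assuming the probe shows up as /dev/ttyACM0 and the target hangs off SWD:

```
arm-none-eabi-gdb firmware.elf
(gdb) target extended-remote /dev/ttyACM0   # the probe itself speaks the GDB protocol
(gdb) monitor swdp_scan                     # scan the SWD bus for attached targets
(gdb) attach 1                              # attach to the first target found
(gdb) load                                  # flash the binary onto the chip
```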
ST-Link V2
This piece of hardware is a Chinese rip-off of ST's on-board programmer and debugger shipped with the 'discovery' line of evaluation boards. It's based on the STM32F103 and is fairly simple; it's essentially a protocol translator and transceiver. Pro: it's dirt cheap. Con: it lacks some of the protections other probes come with, and it can only do SWD. All in all it is a good place to start, I guess, especially when you consider the ten- to fifty-fold price increase to a "professional" debug probe.
Segger J-Link
A proprietary piece of hardware that interacts with proprietary software, namely JLinkExe and JLinkGDBServer. Its use is well documented and it performs well in my experience. The docs can be viewed here. I like it, but it's proprietary and you need to learn proprietary software to use it. It has a fairly good GUI which can really help (even though you don't really need it, as GDB comes with its own 'TUI' interface).
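For reference, starting Segger's GDB server for a hypothetical STM32F103 target might look like this (the device name, interface and speed are assumptions; adjust them to your setup):

```
# Start the proprietary GDB server, talking SWD to the given device
JLinkGDBServer -device STM32F103C8 -if SWD -speed 4000

# In another terminal, connect a GDB client (default port 2331)
arm-none-eabi-gdb firmware.elf -ex "target extended-remote localhost:2331"
```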
The Software
As a way of doing things, I'll work backwards from the microcontroller. So the current setup is the following: the microcontroller is connected to the debug probe via either JTAG or SWD, and the probe is in turn connected to the host via USB. Now we need software on the host to get the compiled binaries onto the chip and to test it in its environment.
This is again dependent on the debug probe used. The commercial options are mostly well documented, which is the main difference from their 'hacked' counterparts. For me, OpenOCD has been the go-to solution for host-side support for most probes of the 'hacked'/open-source kind. It provides a GDB server whose commands are translated into the USB commands of the specific debug probe. For the most common probes the configuration is also shipped with OpenOCD and has to be loaded alongside the configuration for the chip. Everything else can then be done via a GDB client that connects to OpenOCD's implementation of the GDB server.
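As a sketch of how this looks in practice, assuming an ST-Link probe and an STM32F1-family chip (substitute the two config files for your probe and target):

```
# Load the probe config and the chip config; OpenOCD then listens
# for GDB clients on port 3333 by default
openocd -f interface/stlink.cfg -f target/stm32f1x.cfg

# In another terminal, point a GDB client at the OpenOCD GDB server
arm-none-eabi-gdb firmware.elf -ex "target extended-remote localhost:3333"
```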
As I am working with Rust, there are cargo commands, especially cargo flash, that take care of talking to the different debug probes with the right settings, so that the compiled binary can be flashed in one step. cargo flash seems to avoid using OpenOCD to start a GDB server, because the only thing it wants to do is load a binary onto the controller. This is probably pretty handy if you are using a discovery board from ST or a similar development board that comes with a built-in debug probe.
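A sketch of that workflow, assuming the cargo-flash subcommand is installed and the chip is an STM32F103C8 (the chip name is an assumption; pass the one matching your board):

```
# Install the subcommand once
cargo install cargo-flash

# Build the release binary and flash it via the attached debug probe
cargo flash --release --chip STM32F103C8
```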
As stated in the section above, most tools come with their own software utilities to do the usual things like writing the compiled image to the chip's flash memory or erasing it. Most tools utilize the GDB protocol for debugging. So on the software side of things we have the following stack:
- debug-utility (if required): this will work with the particular hardware and produce a set of commands that can be executed from the command line or from a connected GDB client.
- The GDB client. The important thing here is that the GDB client in question needs to be able to interpret the instruction set of the target processor. When doing ARM development this is fairly straightforward, but it gets more and more complicated the more obscure the chip is. For some targets you will have to rely on proprietary tools from the chip vendor to get decent debugging support, and that may mean you have to use Windows or another particular operating system. A typical debugging session is sketched below.
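As an illustration of the client side, a typical session against a running GDB server might look like this (the port and the monitor command are OpenOCD-flavoured assumptions; other servers differ slightly):

```
(gdb) target extended-remote localhost:3333   # connect to the GDB server
(gdb) load                                    # write the image to flash
(gdb) monitor reset halt                      # server-specific: reset and halt the core
(gdb) break main                              # set a breakpoint
(gdb) continue                                # run until it is hit
```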
The Compiler and binutils
So the "last" or "first" part depending on your view, is the compiler and linker. They turn the written code into the bits that actually make the processor tick. Here there is a lot to unpack, because there needs to software support for the chips as well as the compiler actually being capable of producing binary output that is compatible with the target architecture. Then as we are not running on an OS the different instructions actually have to be placed at the right spot in memory so that the controller starts up as expected. Also micro controllers exhibit a behaviour called Interrupts that make the controller jump to a preprogrammed memory location when a certain external event (a pin goes high for example) happens. These memory locations have to be specified and code has to be placed at the destination of the jump to stop the controller form executing random stuff that just happens to be there causing lots of undefined behaviour.
Memory maps and linker scripts
Continuing the bottom-up approach, I'll start with the memory map and linker script. The memory map tells the linker where the different sections of memory are located within the address space available to the controller. All microcontrollers known to me map flash, RAM and all the registers of special-purpose hardware into one logical address range. This means that to access a special sort of memory, all the controller has to do is access a special address range. As far as I know, "normal" x86 chips work in the same way, with the difference that they have a memory management unit, which can only be accessed through privileged instructions and which the kernel configures to make the memory look contiguous and homogeneous for the processes running on top of the OS. As we don't have the luxury of an OS, we have to deal with the inhomogeneous and non-contiguous peculiarities of the actual memory layout and have to communicate them to the linker that builds the final image. This is what the memory map is for.
The memory map (at least the part that changes from chip to chip) has to be specified in the memory.x file, which is used by the linker script to describe the basic memory properties of the device.
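For example, a minimal memory.x for a hypothetical part with 64 KiB of flash and 20 KiB of RAM (the origins shown are typical for STM32 devices; check the datasheet for your chip):

```
MEMORY
{
  /* ORIGIN and LENGTH must match the actual chip */
  FLASH : ORIGIN = 0x08000000, LENGTH = 64K
  RAM   : ORIGIN = 0x20000000, LENGTH = 20K
}
```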
As we are not using the standard linker setup, we also need to change things here. Fortunately the people from the Rust embedded group have already taken care of that and provide a linker script for us as part of the cortex_m_rt crate. We simply have to specify that the linker should use that script instead of the standard one. The linker script defines the symbols and locations for the interrupt vectors and the memory location where the first instruction for the chip is placed, which is called the entry point. The script also decides where the stack is placed and has to take the calling convention of the compiler into consideration. For Cortex-M devices the stack is either placed at the end of the RAM section or in a special section of RAM that is more tightly coupled to the processor and is therefore a bit faster, speeding up function calls.
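With cargo, pointing the linker at that script is typically done in .cargo/config.toml; a sketch, assuming a Cortex-M3 and therefore the thumbv7m-none-eabi target:

```toml
[target.thumbv7m-none-eabi]
# use the link.x script shipped by cortex_m_rt instead of the default
rustflags = ["-C", "link-arg=-Tlink.x"]
```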
The vector table and sections
The vector table and the sections are the things that essentially turn into addresses/structs in the compiler. The compiler has to know that these things exist even if it does not know where they are (that, after all, is the job of the linker). So the symbols for these sections and the interrupt vectors have to be defined for the linker on the one hand and for the compiler on the other. The sections are the places where the compiled instructions and the data for those instructions are placed in the memory image. There are five sections in images for Cortex-M devices. These are:
- .vector_table: This section defines the start and size of the vector table. The vector table is a list of addresses that the CPU jumps to when the corresponding interrupt is triggered.
- .text: This is the section where the instructions for the CPU are kept. No data should be found here, as it lives in other sections.
- .rodata: This section is where read-only data is kept. As it is read-only, it can live in the (normally much larger) flash and does not have to be kept in RAM.
- .data: This section holds the initialized data of the program. This data is initialized with values different from 0 and needs to be copied into RAM by the startup code at the start of the program.
- .bss: This section holds only data that is initialized to zero. The zeros are not necessarily written into the image; instead the section is kept clear by the linker and the startup code zeroes the corresponding RAM region at reset.
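A small illustration of how Rust items end up in these sections (the names are made up; the placement follows from the initializers, it is not something the declarations control explicitly):

```rust
// .rodata: read-only, so it can stay in flash
static GREETING: &str = "hello";

// .data: non-zero initializer, copied from flash into RAM at startup
static mut COUNTER: u32 = 42;

// .bss: zero initializer, the region is simply zeroed at startup
static mut BUFFER: [u8; 64] = [0; 64];
```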
The compiler
Now the last thing left to do (that is not yet writing software) is to tell the compiler what kind of machine instructions to produce as the result of the compilation process. This may be a simple thing to do for the user, but it is definitely not a simple task for the designers of the compiler and the people who implement the different architectures. The word width, the number and special facilities of the different registers, and not least the instruction pipeline (each instruction is generally split into three steps: FETCH, DECODE and EXECUTE) and its specifics have to be taken into consideration when writing these parts of the compiler. (Frankly, the last part is really only useful so that the compiler can structure the emitted code in such a way that subsequent instructions can utilize more parts of the core simultaneously; it is not strictly necessary to get working code, but is very useful to get well-performing code.)
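In Rust, selecting the instruction set boils down to picking a target triple; a sketch, assuming a Cortex-M3 core (other Cortex-M cores use slightly different triples):

```
# install the compiler support for the target once
rustup target add thumbv7m-none-eabi

# tell the compiler which machine instructions to produce
cargo build --release --target thumbv7m-none-eabi
```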
Summary
Setting up a toolchain for a microcontroller is not quite as easy as it is for programs that run in the same environment they are written in. We need to be able to talk to the physical chip, for which we need an interface from the host to the target IC. This is commonly called the programmer or debug probe, and there are very many different pieces of hardware that can fulfil that job, each with its specialized interface software. Most of these debug probes can be interfaced with using OpenOCD, which wraps the device-specific commands into a GDB server that a fitting GDB client can connect to. From there the programmer can issue commands to reload the binary, set breakpoints or interrupt program execution using various triggers.
To be able to produce a binary for the microcontroller, the linker (that is, the program that takes all the small bits of compiled code from the compiler and builds one contiguous image from them) needs to know where to put which part of the program in the memory of the controller (we don't want to store parts of the program at volatile or illegal memory addresses, after all). This is done by specifying a memory map, which the cortex_m_rt crate takes care of, with the exception of the device-specific RAM and FLASH layout that has to be specified in memory.x. The cortex_m_rt and device-specific crates also take care of initializing the interrupt vector table, and they point the reset vector to code that initializes the variables in RAM and then defers to the code specified by the #[entry] macro.
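Putting it all together, a minimal program on top of this scaffolding looks roughly like this (a sketch; panic_halt is just one of several possible panic handlers):

```rust
#![no_std]  // no operating system, so no standard library
#![no_main] // the usual OS entry point does not apply here

use cortex_m_rt::entry;
use panic_halt as _; // a no_std binary must define panic behaviour

// The reset handler provided by cortex_m_rt initializes .data and
// .bss, then jumps to the function marked #[entry].
#[entry]
fn main() -> ! {
    loop {
        // the application code goes here
    }
}
```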
The next challenge will be understanding the environment provided by the different device and hal (hardware abstraction layer) crates, and the underlying hardware of a microcontroller that needs to be set up in our code before we can enter the loop where we actually run the code that controls whatever we want to control with this incredibly powerful microcomputer (I know they are dwarfed by modern laptops, but measured against their ancestors they are tiny yet incredibly powerful).
'till next time. Cheers.