This part of the tutorial covers probably one of the most important parts of development – debugging. Debugging on microcontrollers is possible thanks to the effort of the engineers who initiated JTAG in the late 1980s. However, JTAG was not invented for debugging but for boundary scanning, which is a method for testing connections on PCBs. Firstly, I will briefly visualise how JTAG developed between the 1980s and today. I will then show how it is included on the discovery board and what happens when you are debugging. After that, I will provide some practical examples and techniques for you to debug your application. Lastly, I will show the main reason why JTAG exists – we will use BSDL to run a boundary scan test on the discovery board!
Wait… JTAG is not an Interface?!
I have heard many people describe JTAG as an interface on the microcontroller which you can use to debug your firmware. This is only partially correct… JTAG (Joint Test Action Group) is the common name for IEEE 1149.1, the boundary-scan standard. It defines structures and methods for boundary scan testing [1]. The JTAG committee initiated it in the late 1980s due to the demand for automated testing of PCBs. Boundary scan can be used to validate manufactured PCBs at the chip level by checking for short circuits, open circuits and circuit faults in general. This is especially important now because circuits are shrinking in size and the pins of chip packages such as BGAs (ball grid arrays) are hardly accessible to external testing equipment. JTAG has evolved since 1990 and can now be used for in-system programming, boundary scan testing, and software debugging and emulation.
Chip manufacturers implement the Test Access Port (TAP) defined in the standard to give developers the ability to connect to the microcontroller over the JTAG interface. It consists of 4 or 5 pins on your microcontroller – TDI, TDO, TMS, TCK and an optional TRST. I will describe it in more detail later. On top of that, the STM32F407VG provides an SWD (Serial Wire Debug) interface in addition to standard JTAG. This only uses 2 pins – SWDIO and SWCLK, shared with TMS and TCK, respectively [2]. This is ARM’s approach to providing an interface for debugging only [3].
JTAG Timeline
You might now ask, ‘how many variations of the JTAG interface are out there?’ Well, JTAG has had many extensions and is still evolving. In the last 30 years the standard was extended with BSDL (Boundary-Scan Description Language) for automated test development and with mixed-signal testing. Similar to ARM’s SWD interface, there is compact JTAG (cJTAG), which is also based on a 2-pin connection. Furthermore, another standard called IEEE 1687 emerged in 2014. As more devices (IPs) were built into chips, there was a need for a high-level protocol to communicate with them. IEEE 1687 accomplished that by leveraging the TAP and defining a hardware architecture for a network of in-chip devices together with its communication protocol [4][5]. Figure 1 illustrates the major events of JTAG development.
TAP Controller
Now, let’s take a closer look at how JTAG is used. Inside every chip that supports the JTAG standard, the manufacturer adds TAP circuitry along with registers such as the Instruction register, the Bypass register, other data registers and an optional Device ID register. When you wish to re-program, debug or do boundary scan testing on your board, you can send data from your machine over the JTAG lines. The TAP Controller is the unit that controls JTAG. It consists of a 16-state state machine controlled by two lines – TMS and TCK. I added Figure 2 showing JTAG on multiple devices, as well as the state machine inside the TAP controller.
If your PCB contains multiple ICs, you can daisy-chain them with JTAG. This makes it flexible to test the whole circuitry, but at the cost of a slower response because there is only a single serial data line. Luckily, JTAG has a compulsory Bypass register which passes data from TDI to TDO through a single bit. Therefore, you can test a single chip in your daisy-chain without having to pass the data through all the boundary register cells of every other chip. To make life easier, TMS and TCK are connected to every chip in parallel, which means that every chip’s TAP state machine is always in the same state.
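To get a feel for how much the Bypass register saves, here is a rough Python sketch that counts the bits in one data scan through a chain. The three-chip board and the FPGA and CPLD register lengths are invented for illustration; only the 406-bit figure is taken from the STM32F407 BSDL file used later in this tutorial. The sketch relies on the bypass register being a single bit long.

```python
def scan_length(chain, target):
    """Number of bits shifted in one data scan through the whole chain.

    chain  -- dict mapping a device name to its boundary register length in bits
    target -- the device under test; every other device is put in BYPASS
    """
    bits = 0
    for name, bsr_len in chain.items():
        # A bypassed device contributes only its 1-bit bypass register.
        bits += bsr_len if name == target else 1
    return bits


# Hypothetical three-chip board; lengths other than 406 are invented.
chain = {"mcu": 406, "fpga": 1200, "cpld": 240}

full = sum(chain.values())             # every device exposes its whole register
bypassed = scan_length(chain, "mcu")   # only the MCU is tested
print(full, bypassed)                  # 1846 408
```

With these made-up numbers, bypassing the two other chips shrinks every scan from 1846 to 408 bits.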
TAP Operations
Inside the state machine you can transition from one state to another by changing the TMS signal. Transitions occur on a rising edge of TCK. The initial state, Test-Logic-Reset, is entered at power-up, when the optional TRST is asserted, or after TMS is held high for at least 5 clock cycles. The state machine can essentially be divided into a data column on the left and an instruction column on the right, as shown in Figure 2. An instruction can be loaded by shifting the instruction code bits while staying in the Shift-IR state. Then, by transitioning from Exit1-IR to Update-IR, the instruction is latched into the Instruction Register and decoded. Similarly, data can be loaded or read back by shifting data bits while staying in the Shift-DR state.
The standard defines multiple instructions to be implemented; however, in most cases it is up to the designer to define the actual instruction codes. The mandatory instructions include BYPASS, SAMPLE, PRELOAD and EXTEST. Some of the optional ones are IDCODE, USERCODE, INTEST, RUNBIST, HIGHZ and CLAMP [6]. I added a quick summary of these instructions in Table 1. The JTAG standard can also be extended with user-defined instructions and registers. Manufacturers such as STMicroelectronics or Texas Instruments use that feature to extend the interface.
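The state machine is easier to follow once you can play with it. Below is a small Python model of the 16 TAP states, written from the state diagram of the standard rather than taken from any tool; walk is just an illustrative helper name. It checks the five-clock reset rule and shows a TMS sequence that leads into Shift-IR and on to Update-IR.

```python
# TAP controller transitions: state -> (next state if TMS=0, next state if TMS=1)
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Apply one TMS value per rising TCK edge and return the final state."""
    for tms in tms_bits:
        state = TAP[state][tms]
    return state


# Five clocks with TMS high end in Test-Logic-Reset, whatever the start state.
reset_ok = all(walk(state, [1] * 5) == "Test-Logic-Reset" for state in TAP)
print(reset_ok)                                    # True

print(walk("Test-Logic-Reset", [0, 1, 1, 0, 0]))   # Shift-IR
print(walk("Shift-IR", [1, 1]))                    # Update-IR
```

Holding TMS low while in Shift-IR or Shift-DR keeps the machine in that state, which is what lets a probe shift as many bits as the selected register needs.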
Instruction | Description |
BYPASS | Bypasses boundary register through the bypass register |
SAMPLE | Captures input data to boundary register |
PRELOAD | Loads boundary scan register cells with new data |
EXTEST | Outputs pre-loaded data from boundary register to ports |
IDCODE | Reads an optional device ID register |
USERCODE | Reads an optional user programmable ID register |
INTEST | Outputs pre-loaded data from boundary register to System Logic |
RUNBIST | Runs IC’s self-test |
HIGHZ | Sets ports to high-impedance mode |
CLAMP | Similar to EXTEST but also bypasses TDI to TDO |
From GDB via STLINK to TAP (Test Access Port)
Firstly, it is beneficial to understand what happens under the hood during debugging. Therefore, we will demystify the signal path from GDB down to the microcontroller. The same happens when you debug using your IDE such as STM32CubeIDE [7].
There are essentially four critical components within an on-target debug session – GDB client, GDB server, debug probe and a microcontroller. To illustrate that, Figure 3 shows how they communicate with each other.
The GDB client is your main friend. It lets you step through your code, stop at breakpoints and watch data. Most importantly though, it shows where your program crashes. Then, there is the GDB server, which also runs on the host. It communicates with the GDB client, mainly over TCP/IP, and sends/receives data to/from the debug probe. For example, it will receive GDB’s command to read a register, translate it into the probe’s protocol and send it over USB to the debug probe.
Many debug adapters exist, such as ST-Link from STMicroelectronics [8], ARM-USB-TINY-H from Olimex [9] or J-Link from Segger [10]. They mostly provide an interface from USB to a microcontroller’s debug port. In other words, ST-Link receives/sends USB commands and transforms them into JTAG/SWD signals. Lastly, these signals access the microcontroller’s on-chip debugging infrastructure defined by the JTAG standard. They eventually operate the TAP’s state machine and access the chip’s registers.
On top of that, ARM provides additional debugging features such as Embedded Trace Macrocell for data and instructions tracing [11]. They are implemented according to ARM’s CoreSight architecture [12]. This is the reason why JTAG/SWD signals in Figure 3 are connected to the STM32F4xx microcontroller through the DAP (Debug Access Port) interface [13]. DAP is ARM’s debug port to re-use the same debugging infrastructure for different architectures such as ARMv7, ARMv8, etc [14].
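To make the first hop of this chain less abstract: the GDB client and the GDB server talk using the GDB Remote Serial Protocol, in which every command is framed as $<payload>#<checksum>, with the checksum being the modulo-256 sum of the payload bytes written as two hex digits. The Python sketch below only builds such packets and does not open a connection. The function name is chosen for illustration, and 0x20000000 (the start of SRAM on STM32 parts) is just an example address.

```python
def rsp_packet(payload):
    """Frame a GDB Remote Serial Protocol payload as $<payload>#<checksum>."""
    checksum = sum(payload.encode("ascii")) % 256   # modulo-256 sum of the bytes
    return f"${payload}#{checksum:02x}"


# 'g' asks the server for all general registers.
print(rsp_packet("g"))              # $g#67

# 'm<addr>,<length>' asks for <length> bytes of memory starting at <addr>.
print(rsp_packet("m20000000,4"))    # $m20000000,4#4f
```

The GDB server receives packets like these on its TCP port and turns each one into the corresponding probe commands.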
Tools for debugging discovery board
So how can we get started with debugging our board? To do this, we will need to get some tools and connect our physical setup. Firstly, let’s see what tools are out there.
1) GDB Client
GDB was already installed as part of the ARM GNU Toolchain installation in the cross-compilation tutorial. To check that you successfully installed the toolchain, you can invoke arm-none-eabi-gdb in your terminal. After that, you should get terminal output similar to mine.
Using ARM GNU Toolchain is not your only option – you can try ARM LLVM toolchain [15]. Also, some development packages come with their own safety-critical toolchain such as ARM’s Keil MDK [16].
2) GDB Server
Multiple options here, too. In fact, we already used one GDB server, stlink, to reflash our discovery board in previous tutorials. To list other options, I made a short list of 3 open source tools I found quite useful:
- Stlink – open source STLINK tools supporting multiple Cortex-M-based microcontrollers. It is very simple to use and offers basic functionality such as flash writing, logging, chip information tool and an optional GUI [17][18]
- pyOCD – this is a quite exciting Python package for debugging tons of Cortex-M devices. Great tool for scripting automated tests and offering multi-core support [19]
- openocd – massively popular with available support for various debug probes and targets, which are not just limited to ARM architecture [20]. Although it is not completely bug-free, we will use it in this tutorial because it provides support for our debuggers and it is worth knowing how to use it.
Therefore, you can install openocd on your system with the following command in your terminal:
sudo apt-get install openocd
You can now check your openocd version with openocd --version
3) Debug adapter (a.k.a probe, dongle or debugger)
The list of debuggers available on the market today is endless. For simple use-cases such as re-programming and debugging you can stick with fairly inexpensive ST-Link or ARM-USB-TINY-H. These are the ones we will use in this tutorial. On the other hand, you can find high-end debuggers coming with more sophisticated tools for profiling, production testing, and supporting many architectures. You can get them from companies such as XJTAG [21] or Lauterbach [22], however, they come with a higher price.
The reason for using ARM-USB-TINY-H in addition to ST-Link is the issue with sending raw JTAG commands with openocd to ST-Link devices. Therefore, we will use ST-Link/V2 with SWD protocol for programming and debugging, and ARM-USB-TINY-H for boundary scanning.
Configuration setup
Now it is time to roll up your sleeves and get your setup dirty. We will use two different setups using ST-Link for debugging and an optional configuration using ARM-USB-TINY-H for boundary scanning.
Setup #1 – USB + On-Board ST-Link
Luckily, discovery boards already have an on-board ST-Link implemented in a second microcontroller – STM32F103. The ST-Link chip receives commands over USB and sends them to the STM32F407 using the SWD interface. On top of that, some boards have stlink/V2 while others are updated to stlink/V2.1. Depending on the version, you will use a different openocd interface configuration. For this setup you only need a USB A male to USB mini cable. Also, do not forget to have jumpers on the CN3 connector.
Setup #2 – ST-Link/V2 probe + SWD
To use an external ST-Link/V2 debugger with the SWD interface you need to connect 5 signals from your debugger to the discovery board: VDD, GND, SWCLK, SWDIO and NRST. GND, SWCLK, SWDIO and NRST can be connected to the SWD header. VDD can be connected to any pin with the same reference voltage, or to the SWD connector if the R2 resistor is soldered (it is not by default). Unfortunately, ST-Link/V2 will not provide power to your board. Therefore, you have two options – either supply external +5V/+3.3V to the board or provide power over USB as in the previous setup. As before, the jumpers should be on the CN3 connector.
Setup #3 – ARM-USB-TINY-H + JTAG
In this setup we can ignore CN3 and the SWD connector. Instead, we connect the JTAG lines directly from the arm-usb-tiny-h debugger to the pins exposed from the microcontroller. Since we are using the basic JTAG connections, we will need at least 4 signals – TDI, TMS, TCK and TDO. These signals can be connected to the PA15, PA13, PA14 and PB3 GPIOs, respectively. All the exposed GPIOs are listed in the discovery datasheet. Although it is possible to supply power from the arm-usb-tiny-h probe, I found it less stable than power over USB or from an external power supply.
To do the wiring correctly you can refer to Figure 4.
Also, you can verify that your connections are correct by reflashing the board with openocd. I added commands for reflashing to the makefile in the repository. Once you are in the tut3_debugging directory, run the following commands
make clean; make all
# for Setup #1 and #2 with stlink/v2
make stlink_v2_reflash
# or for Setup #1 and #2 with stlink/v2-1
make stlink_v2.1_reflash
# for Setup #3 (arm-usb-tiny-h)
make olimex_reflash
Your terminal output should indicate reflashing success similar to the one below
Enough talk – let’s debug!
I split this section into multiple mini-parts, each of which showcases a specific technique or a tool for debugging. I am assuming that your setup is ready and you installed both openocd and ARM toolchain. If not, please refer to previous sections of this tutorial. I will use the first setup with USB only but any other will work for the upcoming examples.
OpenOCD & telnet reprogramming
Let’s start with something you already did a couple of times – reprogramming. This time, however, we will use Telnet to send openocd commands one by one.
In the first terminal, change to the tut3_debugging directory. Then run openocd, passing it the configuration file that loads the stlink interface and the stm32f4x target. All configuration files are saved in the tools directory.
Terminal 1 (OpenOCD):
cd <embeddedTutorial-repo-path>/tut3_debugging
make clean; make all
openocd -f ../tools/stlink-v2-swd.cfg
In the second terminal, open telnet client passing localhost and TCP port 4444 as arguments. This will establish TCP/IP connection with an openocd session.
Terminal 2 (Telnet):
telnet localhost 4444
Now you can check if openocd can detect stm32f407 microcontroller. You can do that with targets command. You should have a result similar to mine.
In your telnet session you can send commands to load the binary image of tut3.bin to 0x8000000 address on your microcontroller. You can then reset the microcontroller and should see a blinking LED if reflashing succeeded.
Terminal 2 (Telnet):
reset init
load_image tut3.bin 0x8000000
reset run
The same result can be achieved by running a program command which we call directly inside our Makefile with make stlink_v2_reflash. You can verify that by checking our script in the Makefile.
stlink_v2_reflash:
openocd --file $(shell pwd)/../tools/stlink-v2-swd.cfg -c "program tut3.bin 0x08000000 exit"
Setting breakpoints
You might be asking, ‘That’s all great, I can reflash. But how do I debug now?’ That’s when the debugger from the ARM toolchain comes in handy. You can follow the previous tutorial if you do not have it installed yet. Otherwise, let’s fire up openocd like before in one terminal. In the second terminal, run gdb from the toolchain and pass the binary in .axf format as an argument. This will load the binary with debug symbols into your session. After that, you can make a TCP connection between gdb and openocd with target extended-remote localhost:3333 or tar ext :3333 for short.
Terminal 2 (GDB):
arm-none-eabi-gdb tut3.axf
(gdb) target extended-remote localhost:3333
Let’s insert a simple breakpoint now. Since our debug symbols are loaded, we can insert a breakpoint just after the LED is toggled, which happens at line 40 in main.c. This way we can toggle the LED every time we hit the breakpoint. You can do that with either break <file>.c:<line_number> or simply b for break.
Terminal 2 (GDB):
(gdb) b main.c:40
You can then reset and halt the target, and run it once again until it hits the breakpoint. Once you have stopped at the breakpoint, you can switch the GDB session to a visual mode by typing CTRL+X followed by CTRL+A. I find the visual mode quite useful to navigate around the code and see more context.
Terminal 2 (GDB):
(gdb) monitor reset halt
(gdb) run
for visual mode: CTRL+X, CTRL+A
You can continue the execution with a continue command or c for short. As long as the breakpoint is set, you should stop at line 40 and see the LED being toggled.
Conditional breakpoints
Breakpoints are quite useful and let you check the program’s execution context. You can also set a breakpoint that triggers only under a certain condition. For example, in addition to the previous breakpoint, let’s say we want to stop at line 38 only if time_counter reaches 50. We can set that breakpoint with an if condition like this
Terminal 2 (GDB):
(gdb) b main.c:38 if time_counter==50
(gdb) cont
If you want to check the state of your variables once you hit the breakpoint, use the command print <var> or p <var> for short. Once you have two breakpoints, you should see execution stop at both line 38 and line 40. It also helps to declare the variables you want to inspect with the volatile qualifier. This way, the compiler will not optimise them away and you will definitely be able to inspect them during a debugging session.
You sometimes might lose track of where the breakpoints were set. To find out you can run info breakpoints or info b. This will also tell you how many times each breakpoint was hit.
You can see that every listed breakpoint has a number assigned to it. If you wish to delete a breakpoint or a range of them, you can use del <breakpoint_num> or del <break_num_start>-<break_num_end>.
(gdb) del 1 -> deletes breakpoint number 1
(gdb) del 2-3 -> deletes breakpoints number 2 and 3
Tracing HardFault
The real power of gdb shows when your program crashes or unexpected values appear in memory. Then, you will need to know how to find the root cause of the fault. Therefore, in this section we will intentionally inject an invalid memory access bug. Namely, we will write a NULL (0) value to a reserved memory address. To enable the bug, uncomment line 44.
43 /// uncomment line below to inject a bug
44 RESERVED_MEMORY = NULL;
Writing to an invalid memory address will result in a HardFault on the Cortex-M4 CPU. Hard faults have multiple causes such as invalid instruction execution, forbidden memory access, stack overflow and many others. Whenever a hard fault occurs, the HardFault exception is raised and the CPU jumps to HardFault_Handler. To act upon it, I added a system reset to HardFault_Handler in main.c.
47 void HardFault_Handler(void) {
48 clocks_system_reset();
49 }
After uncommenting line 44, you can compile the program, run openocd in one terminal like before and load the defective binary onto your microcontroller. Afterwards, insert two breakpoints at SysTick_Handler and HardFault_Handler. You can do that with the b <function_name> command. After running the program, execution should immediately stop at SysTick_Handler. Once you are there, you can step through the code using the next or n command. This will step line by line down to our bug until you hit the HardFault_Handler breakpoint.
Terminal 2 (GDB): // openocd in Terminal 1
make
arm-none-eabi-gdb tut3.axf -ex "target extended-remote localhost:3333" -ex "monitor reset" -ex "load" -ex "monitor reset init"
(gdb) b SysTick_Handler
(gdb) b HardFault_Handler
(gdb) r
(gdb) n # 3 times or 2x ENTER after the first n
Some of the most useful commands at this point will be backtrace and info registers or i r. If you call backtrace command you should see the trace of the faulty instruction call. However, sometimes this is not so easy and you will have to check stack registers with info registers to unwind the trace [23].
Terminal 2 (GDB):
(gdb) backtrace
(gdb) i r
After calling backtrace you can see that the HardFault_Handler was triggered after a call to main.c:44, which is the root cause.
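When backtrace does not get you there, the raw exception stack frame often does. On exception entry a Cortex-M core pushes eight words (r0 to r3, r12, lr, pc and xPSR) onto the active stack, and the stacked pc points at or near the instruction that faulted. The Python sketch below only labels eight words that you would dump yourself, for example with x/8xw $sp in GDB. It assumes that no floating-point context was stacked and that sp still points at the frame, meaning the handler has not pushed anything yet; the sample values are invented.

```python
# Order in which a Cortex-M core stacks registers on exception entry,
# from the lowest address (where SP points) upwards. FPU state is ignored.
FRAME_REGS = ("r0", "r1", "r2", "r3", "r12", "lr", "pc", "xpsr")


def decode_frame(words):
    """Label the eight words of a basic exception stack frame."""
    if len(words) != len(FRAME_REGS):
        raise ValueError("expected exactly 8 stacked words")
    return dict(zip(FRAME_REGS, words))


# Invented sample dump, not taken from a real session.
sample = [0x00000000, 0x20000010, 0x00000000, 0x40023800,
          0x00000000, 0x08000231, 0x0800024A, 0x21000000]

frame = decode_frame(sample)
print(f"stacked pc = {frame['pc']:#010x}, stacked lr = {frame['lr']:#010x}")
# stacked pc = 0x0800024a, stacked lr = 0x08000231
```

With a real dump you could feed the stacked pc to list *<address> in GDB to see which source line it belongs to.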
This scenario was quite straightforward and we knew where the issue was. However, when your codebase is larger, tracing the fault becomes one of the necessary survival skills. This introduction to gdb should be a good starting point for developing a debugging skillset.
Below firmware testing – Boundary Scan
In this part we will finally show the reason why IEEE 1149.1 standard was created. Hence, we will use STM32F407 BSDL file together with openocd to check if our GPIO is pulled down or up. On top of that, we will blink LED5 on discovery board with no firmware – by raw JTAG commands.
Firstly, you will need to prepare the discovery board with the arm-usb-tiny-h setup. You will also need the BSDL (Boundary Scan Description Language) file for the STM32F407, which is provided by the manufacturer. I already added one corresponding to the LQFP100 package to the tools directory in the repository. The important attributes which this file describes include the codes of the implemented JTAG instructions.
"BYPASS (11111)," &
"EXTEST (00000)," &
"SAMPLE (00010)," &
"PRELOAD (00010)," &
"IDCODE (00001)";
You can also find the length of the BSR (Boundary Scan Register) and how to control GPIOs. In this example, we are interested in two GPIOs – PD14 connected to the red LED5 and PB9 for input reading. By examining BOUNDARY_REGISTER defined in the BSDL file we know that the length is 406 because the last defined cell number is 405. Also, there are 3 register cells used for each GPIO – output, control and input. The ones defined for PB9 and PD14 are shown below
PB9:
"20 (BC_1, *, CONTROL, 1), " &
"19 (BC_1, PB9, OUTPUT3, X, 20, 1, Z), " &
"18 (BC_4, PB9, INPUT, X), " &
PD14 (LED5):
"162 (BC_1, *, CONTROL, 1), " &
"161 (BC_1, PD14, OUTPUT3, X, 162, 1, Z), " &
"160 (BC_4, PD14, INPUT, X), " &
Therefore, if you want to read the PB9 input, you can send a SAMPLE (0x2) instruction and then read cell number 18 of the BSR. On the other hand, to control LED5 you have to send the PRELOAD instruction, which has the same code as SAMPLE, then enable the output by loading 0 into control cell 162 and set the GPIO output level with cell 161. Let’s start with reading PB9 while externally pulling the pin down and up.
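Looking up cell numbers by hand gets tedious for more than a couple of pins, so here is a minimal Python sketch that extracts them from boundary register lines in the format quoted above. It is not a BSDL parser: it only understands this one line shape, and the function names are made up for the example.

```python
import re

# Matches the quoted part of a boundary register line: "<cell> (<fields>)
CELL_RE = re.compile(r'"\s*(\d+)\s*\(([^)]*)\)')


def parse_cells(lines):
    """Return {cell number: (port, function, control cell or None)}."""
    cells = {}
    for line in lines:
        match = CELL_RE.search(line)
        if match is None:
            continue
        fields = [field.strip() for field in match.group(2).split(",")]
        port, function = fields[1], fields[2].upper()
        # Output cells name their control cell in the 5th field.
        control = int(fields[4]) if len(fields) > 4 else None
        cells[int(match.group(1))] = (port, function, control)
    return cells


def find_cell(cells, port, function):
    """Look up the cell number for a given port and cell function."""
    for number, (cell_port, cell_function, _) in cells.items():
        if cell_port == port and cell_function == function:
            return number
    raise KeyError(f"{port} {function}")


BSDL_LINES = [
    '"20 (BC_1, *, CONTROL, 1), " &',
    '"19 (BC_1, PB9, OUTPUT3, X, 20, 1, Z), " &',
    '"18 (BC_4, PB9, INPUT, X), " &',
    '"162 (BC_1, *, CONTROL, 1), " &',
    '"161 (BC_1, PD14, OUTPUT3, X, 162, 1, Z), " &',
    '"160 (BC_4, PD14, INPUT, X), " &',
]

cells = parse_cells(BSDL_LINES)
print(find_cell(cells, "PB9", "INPUT"))     # 18
led_out = find_cell(cells, "PD14", "OUTPUT3")
print(led_out, cells[led_out][2])           # 161 162
```

The same lookup could be run over the whole BOUNDARY_REGISTER attribute to build a pin-to-cell map for a scripted test.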
Sampling PB9
We will start with pulling down PB9 to the GND by simply connecting these two pins together on your discovery board. After that you will open one terminal to run openocd with ARM-USB-TINY-H configuration and a second terminal with telnet.
Terminal 1 (OpenOCD):
make olimex_connect
Terminal 2 (Telnet):
telnet localhost 4444
You should notice that connecting with openocd and ARM-USB-TINY-H debugger results in finding two TAP devices. That’s because STM32F4xx has indeed two TAPs connected in series – one for boundary scan and one for Cortex-M4 [24].
We can also list the detected TAPs by scanning the chain from the Telnet terminal. Also, we can send the IDCODE (0x1) instruction followed by reading the 32-bit data register. This will result in reading the ID of stm32f4x.bs as below.
Terminal 2 (Telnet):
> poll off
> scan_chain
> irscan stm32f4x.bs 0x1
> drscan stm32f4x.bs 32 0
There are two basic JTAG commands we just used – irscan for sending instruction commands and drscan for sending data commands. Now the TAP state machine I showed earlier should start making some sense, as it had two columns which dealt with these commands. Both commands take the name of the TAP device, which is stm32f4x.bs, as the first argument. For the irscan command the second argument is the actual instruction code, whereas drscan takes a number of bits and a value. A data value is always needed for drscan; it is shifted in while the stored contents of the data register are shifted out and returned. Also, the initial poll off command disables the periodic polling of all detected TAPs, which would otherwise conflict with our commands.
We can now read the PB9 input by first sending the SAMPLE instruction followed by a drscan command with the BSR’s length of 406 and a zero value (the value does not matter). You should get back the value currently stored in the BSR. You can then identify bit 18, which reflects the PB9 input value. It should be equal to zero. After that you can pull the PB9 GPIO up to Vdd with a wire and send the same drscan command. Your output should show bit 18 changed to HIGH. You can find commands and expected output below.
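The 32-bit value returned by the IDCODE scan has a fixed layout defined by IEEE 1149.1: a 4-bit version, a 16-bit part number, an 11-bit JEDEC manufacturer code and a lowest bit that is always 1. The Python sketch below splits a value into these fields. Treat 0x06413041 as sample input only and compare it with what your own board returns.

```python
def decode_idcode(idcode):
    """Split a 32-bit IEEE 1149.1 IDCODE into its fields."""
    return {
        "version": (idcode >> 28) & 0xF,          # bits 31..28
        "part_number": (idcode >> 12) & 0xFFFF,   # bits 27..12
        "manufacturer": (idcode >> 1) & 0x7FF,    # bits 11..1, JEDEC code
        "marker": idcode & 0x1,                   # bit 0, always 1
    }


SAMPLE_IDCODE = 0x06413041   # sample input, not read from hardware here
fields = decode_idcode(SAMPLE_IDCODE)
print({name: hex(value) for name, value in fields.items()})
# {'version': '0x0', 'part_number': '0x6413', 'manufacturer': '0x20', 'marker': '0x1'}
```

A marker bit of 0 would mean that what you shifted out is not an IDCODE at all, for instance a bypass register.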
Terminal 2 (Telnet):
> irscan stm32f4x.bs 0x2
> drscan stm32f4x.bs 406 0
Connect PB9 with Vdd
> drscan stm32f4x.bs 406 0
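Picking one bit out of a 406-bit hex value by eye is error-prone, so a small helper can do it for you. This Python sketch assumes that the captured value is printed as a single hex string whose least significant bit corresponds to cell 0; the two sample captures are invented, not real board output.

```python
BSR_LENGTH = 406      # boundary scan register length from the BSDL file
PB9_INPUT_CELL = 18   # "18 (BC_4, PB9, INPUT, X)"


def bsr_bit(drscan_hex, cell):
    """Return the value (0 or 1) of one boundary scan cell."""
    if not 0 <= cell < BSR_LENGTH:
        raise ValueError("cell number outside the boundary scan register")
    return (int(drscan_hex, 16) >> cell) & 1


# Invented captures: all cells low, then only cell 18 high.
pulled_down = "0" * 102
pulled_up = format(1 << PB9_INPUT_CELL, "0102x")

print(bsr_bit(pulled_down, PB9_INPUT_CELL), bsr_bit(pulled_up, PB9_INPUT_CELL))   # 0 1
```

Paste the string returned by your own drscan in place of the sample values to check the pin state.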
Controlling LED5
Writing to PD14, which LED5 is connected to, is no more complicated than reading the PB9 input above. However, instead of sending a zero value to the BSR, we will read it back, modify it to control PD14 and keep toggling it by sending drscan commands. Before we do that, we need to send an EXTEST (0x0) instruction. As you can see in Table 1, the EXTEST instruction connects the pre-loaded value in the BSR to the output logic.
If you are not continuing from the previous section, you will need to open both terminals and send the poll off command in the Telnet one. After that we can load EXTEST followed by writing 406 ones so that we shift the current BSR value out and read it. We can prevent the all-ones value from being driven onto the ports by specifying the end state of drscan to be drpause. This should return the current BSR value, which we can use later to control LED5. To make the drscan command more readable, we will split the 406 bits into twelve 32-bit chunks and one 22-bit chunk. The first chunk is the least significant.
Terminal 2 (Telnet):
> reset init
> poll off
> irscan stm32f4x.bs 0x0
> drscan stm32f4x.bs 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 32 0xFFFFFFFF 22 0x3FFFFF -endstate drpause
<current_BSR_value>
You will then need to copy the returned BSR value and send it back with drscan, with bits 162 and 161 set to zero and one, respectively. This enables the PD14 output and sets it HIGH in a single command. Your LED5 should turn on. You can then keep toggling it by changing bit 161.
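Editing two bits inside thirteen hex fields by hand is easy to get wrong, so the step above can be sketched in Python. The sketch assumes that the BSR value comes back as one hex value per field in the same order as it was sent (first field least significant); the all-ones "current" value is invented, and the function names are only illustrative.

```python
CHUNKS = [32] * 12 + [22]    # field widths: 12 x 32 + 22 = 406 bits
LED5_OUTPUT_CELL = 161       # "161 (BC_1, PD14, OUTPUT3, X, 162, 1, Z)"
LED5_CONTROL_CELL = 162      # "162 (BC_1, *, CONTROL, 1)"


def join_fields(fields):
    """Combine hex fields (first one least significant) into one integer."""
    value, shift = 0, 0
    for width, text in zip(CHUNKS, fields):
        value |= (int(text, 16) & ((1 << width) - 1)) << shift
        shift += width
    return value


def split_fields(value):
    """Split a 406-bit integer back into the 13 field values."""
    fields = []
    for width in CHUNKS:
        fields.append(value & ((1 << width) - 1))
        value >>= width
    return fields


def led5_command(current_fields, on, tap="stm32f4x.bs"):
    """Build a drscan command that drives PD14 and keeps all other cells."""
    bsr = join_fields(current_fields)
    bsr &= ~(1 << LED5_CONTROL_CELL)      # control cell = 0 enables the output
    if on:
        bsr |= 1 << LED5_OUTPUT_CELL
    else:
        bsr &= ~(1 << LED5_OUTPUT_CELL)
    parts = []
    for width, value in zip(CHUNKS, split_fields(bsr)):
        digits = (width + 3) // 4         # hex digits needed for this field
        parts.append(f"{width} 0x{value:0{digits}X}")
    return f"drscan {tap} " + " ".join(parts)


# Invented starting point: pretend the BSR came back as all ones.
current = ["FFFFFFFF"] * 12 + ["3FFFFF"]
print(led5_command(current, on=True))
print(led5_command(current, on=False))
```

Cells 161 and 162 fall into the sixth field (bits 160 to 191), so only that field differs between the two printed commands.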
Boundary Scanning can be very powerful when testing PCBs automatically. With a little help of scripting you could easily check short and open circuits or test functionality of components on your PCBs.
References
[2] STM32F407xx datasheet, pp. 40
[3] ARM Serial Wire Debug Port
[4] What’s The Difference Between JTAG (IEEE 1149.1) And IJTAG (IEEE P1687)?
[5] IEEE 1687
[6] The Boundary-Scan Handbook. pp. 34-42
[7] STM32CubeIDE
[8] ST-Link/V2
[9] ARM-USB-TINY-H
[10] J-Link
[11] Embedded Trace Macrocells
[12] ARM CoreSight Architecture Specification Version 3.0
[13] Debug Access Port
[14] ARM Debug Interface v5 Architecture Specification
[15] ARM LLVM Toolchain
[16] Keil MDK Compiler
[17] Open-source Stlink repository
[18] Open-source Stlink supported boards
[19] pyOCD
[21] XJTAG
[22] Lauterbach
[23] Memfault: How to debug a HardFault on an ARM Cortex-M MCU
[24] STM32F407 Reference Manual – STM32F4xx JTAG TAP connection, pp 1686