Microboot on the ATtiny85

The last post gave an overview of the protocol and tools used with the bootloader I've developed - this post describes the implementation details for the version I have running on the ATtiny85. All the code is available on GitHub and released under a Creative Commons Attribution-ShareAlike 4.0 International License.

# Hardware Requirements

The bootloader requires two pins: one to determine whether the bootloader should enter programming mode and the other to implement serial communications. These pins are only used while the bootloader code is running, so the application is still free to use them for whatever purpose it wants (with some minor restrictions). The serial port is described a little later in the post; first I'll explain the purpose of the bootloader entry pin and how it is used.

When the chip is first powered on the bootloader code starts running and needs to decide whether to enter bootloading mode or simply hand over control to the application code. To make this decision it inspects the state of an input pin. If the pin is held low it initialises the serial port and waits for commands; otherwise the pin is restored to its default state and the application code is started.

To minimise the external hardware required I use the internal pull-up resistor on the input pin, so all that is required is a push button or jumper switch to tie the pin to ground before powering up the device. In the default configuration I use pin B3 (physical pin 2 on the 8 pin DIP version of the chip) for this purpose but you can change that to whatever you like.
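The entry check described above can be sketched roughly as follows. This is not the code from the microboot sources: the function name `bootloader_requested` and the idea of passing the port registers in as pointers are assumptions made so the same logic can be compiled for the AVR (passing `&DDRB`, `&PORTB` and `&PINB`) or exercised on a desktop machine with plain variables.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper, not taken from the microboot sources.
// ddr/port/pin are the direction, output and input registers of the port
// that carries the bootloader entry pin; bit is the pin number (3 for B3).
bool bootloader_requested(volatile uint8_t *ddr, volatile uint8_t *port,
                          const volatile uint8_t *pin, uint8_t bit) {
  const uint8_t mask = (uint8_t)(1u << bit);
  *ddr &= (uint8_t)~mask;   // configure the pin as an input
  *port |= mask;            // enable the internal pull-up
  // On real hardware a short delay here gives the pull-up time to act.
  bool requested = (*pin & mask) == 0;  // held low: enter the bootloader
  if (!requested)
    *port &= (uint8_t)~mask;  // pull-up off again, back to the reset default
  return requested;
}
```

On the ATtiny85 with the default pin assignment the call would look like `bootloader_requested(&DDRB, &PORTB, &PINB, 3)`.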

If you are using this pin in your application you need to make sure that it is not held low when power is applied and that having it rise high (through the pull-up) is not going to trigger whatever circuitry you have attached to it.

# Single Pin UART

The UART used for communication is implemented in software (it's the same design I used in the Safety Light project) and uses a single pin for both Tx and Rx with a small amount of external circuitry (shown to the right). The design and code came from this site and was originally developed by Ralph Doncaster.

The default configuration uses pin B5 (physical pin 1 on the 8 pin DIP version of the chip) which is usually the RESET pin. Having support for a serial bootloader negates the need for SPI programming so the RESET pin is simply consuming a valuable resource to no real benefit. Disabling the RESET function and using it as the serial interface means that you can use an extra IO pin in your application (and given that the serial circuitry is already attached to it you may as well use it as a serial port).

# Pin Assignments and Fuses

The pin assignments are made in the file 'hardware.h' in the source directory; you can simply change them there if you want to use different pins from the ones I have selected. If you do use the pins I have selected you will need to ensure the RSTDISBL fuse is programmed so the RESET pin can be used as general purpose IO.

To allow writing to the flash from the bootloader code you must have the SELFPRGEN fuse programmed as well; this is not optional if you want the bootloader to work at all. The fuse bytes I use are:

 This set of fuses runs the chip at 8MHz using the internal RC oscillator, sets the brown out detection level to 2.7V, enables self programming and disables the RESET pin. Here is the sequence of commands I use to build the bootloader, burn it to the chip and set the fuses:

```
./makeall.sh
make MCU=attiny85 flash
make MCU=attiny85 fuses
```

> **WARNING**: Using this combination of fuses means that you will not be able to re-program the chip using SPI - you will need a high voltage programmer (or [similar tool](http://mdiy.pl/atmega-fusebit-doctor-hvpp/?lang=en)) to reset the chip to factory defaults. The sequence of commands is important as well - if you change the fuses first you will not be able to flash the code (the fuse settings will disable RESET and prevent you from entering SPI programming mode).

The targets in the Makefile are set up to use a [USBasp programmer](http://www.fischl.de/usbasp/) or one of its many clones. If you have a different programmer simply edit the *avrdude* command in the Makefile and change the command line parameters appropriately.

# Reading Flash Pages

 Reading the flash memory is done with the *lpm* instruction on the AVR. The standard C library that is part of *avr-gcc* includes the [pgmspace.h](http://www.nongnu.org/avr-libc/user-manual/group__avr__pgmspace.html) utility library so I'm just using that for simplicity. To process the read command I simply use the '*pgm_read_byte_near()*' function to fill a buffer with data from the specified address in flash and send it back.

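```
bool readFlash() {
  // The first two bytes of the packet hold the flash address, high byte first
  uint16_t address = ((uint16_t)g_buffer[0] << 8) | g_buffer[1];
  // Copy DATA_SIZE bytes from flash into the packet, after the address
  for(uint8_t index = 0; index<DATA_SIZE; index++, address++)
    g_buffer[index + 2] = pgm_read_byte_near(address);
  return true;
  }
```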

 It's not the most efficient way of doing things but it's simple, easy to understand and it works.

# Writing Flash Pages

Writing to the flash is a bit more complicated. The entire process is described in detail in the [datasheet](http://www.atmel.com/devices/ATTINY85.aspx), so I'll just give a rundown of the process here.

 The program flash is accessed in pages (on the ATtiny85 each page is 32 words or 64 bytes in size) and the write operation must update an entire page at a time - you can't simply write byte by byte the way I handle the read operation.

The bootloader protocol currently uses 16 byte data packets, so we only have that much data to write at a time and need to keep the rest of the flash page contents unchanged. The sequence I use is:

1. Copy the appropriate page from flash to a memory buffer.
2. Update a portion of the memory buffer with the data provided to the bootloader.
3. Erase the flash page in question.
4. Write the memory buffer (a combination of the original contents with the new data) into the flash page.
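Steps 1 and 2 come down to a little address arithmetic, which can be sketched in portable C. This is a minimal illustration rather than the bootloader's own code: `merge_into_page` and `PAGE_SIZE` are hypothetical names, and the page is assumed to have already been copied from flash into a RAM buffer.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 64u  // ATtiny85 flash page size in bytes (SPM_PAGESIZE)

// Hypothetical helper: merge `len` bytes of packet data destined for flash
// byte address `address` into `page`, a RAM copy of the flash page that
// contains that address. Returns the number of bytes that landed in this
// page; any remainder belongs at the start of the following page.
uint8_t merge_into_page(uint8_t page[PAGE_SIZE], uint16_t address,
                        const uint8_t *data, uint8_t len) {
  uint16_t page_address = (uint16_t)(address - (address % PAGE_SIZE));
  uint8_t offset = (uint8_t)(address - page_address);  // position in page
  uint8_t room = (uint8_t)(PAGE_SIZE - offset);        // bytes left in page
  uint8_t count = (uint8_t)(len < room ? len : room);
  memcpy(&page[offset], data, count);
  return count;
}
```

A 16 byte packet for address 0x0050 lands at offset 16 of the page starting at 0x0040. A packet for address 0x0078 only has room for 8 bytes before the page ends, so the remaining 8 bytes need a second erase and write cycle on the next page.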

 Apart from preparing the memory buffer the code I use to do this is essentially the sample provided in the datasheet.

```
bool writeFlash() {
  uint16_t page_address, address = (((uint16_t)g_buffer[0] << 8) & 0xFF00) | g_buffer[1];
  uint8_t page_hi, page_lo, index, written = 0;
  while(written<DATA_SIZE) {
    page_address = ((address / SPM_PAGESIZE) * SPM_PAGESIZE);
    page_hi = (page_address >> 8) & 0xFF;
    page_lo = page_address & 0xFF;
    // Read the page into the buffer
    for(index=0; index<SPM_PAGESIZE; index++)
      g_pagecache[index] = pgm_read_byte_near(page_address + index);
    // Add in our data
    uint8_t offset = (uint8_t)(address - page_address);
    for(index=0;(written<DATA_SIZE)&&((offset + index)<SPM_PAGESIZE);written++,index++)
      g_pagecache[offset + index] = g_buffer[written + 2];
    // Write the page
    asm volatile(
      // Y points to memory buffer, Z points to flash page
      "  mov   r30, %[page_lo]                     \n\t"
      "  mov   r31, %[page_hi]                     \n\t"
      "  ldi   r28, lo8(g_pagecache)               \n\t"
      "  ldi   r29, hi8(g_pagecache)               \n\t"
      // Wait for previous SPM to complete
      "  rcall wait_spm                            \n\t"
      // Erase the selected page
      "  ldi   r16, (1<<%[pgers]) | (1<<%[spmen])  \n\t"
      "  out %[spm_reg], r16                       \n\t"
      "  spm                                       \n\t"
      // Transfer data from RAM to Flash page buffer
      "  ldi   r20, %[spm_pagesize]                \n\t"
      "write_loop:                                 \n\t"
      // Wait for previous SPM to complete
      "  rcall wait_spm                            \n\t"
      "  ld    r0, Y+                              \n\t"
      "  ld    r1, Y+                              \n\t"
      "  ldi   r16, (1<<%[spmen])                  \n\t"
      "  out %[spm_reg], r16                       \n\t"
      "  spm                                       \n\t"
      "  adiw  r30, 2                              \n\t"
      "  subi  r20, 2                              \n\t"
      "  brne  write_loop                          \n\t"
      // Wait for previous SPM to complete
      "  rcall wait_spm                            \n\t"
      // Execute page write
      "  mov   r30, %[page_lo]                     \n\t"
      "  mov   r31, %[page_hi]                     \n\t"
      "  ldi   r16, (1<<%[pgwrt]) | (1<<%[spmen])  \n\t"
      "  out %[spm_reg], r16                       \n\t"
      "  spm                                       \n\t"
      // Exit the routine
      "  rjmp   page_done                          \n\t"
      // Wait for SPM to complete. spm_reg is an I/O address, so it has to
      // be read with 'in' ('lds' would treat it as a data space address).
      "wait_spm:                                   \n\t"
      "  in     r17, %[spm_reg]                    \n\t"
      "  andi   r17, 1                             \n\t"
      "  cpi    r17, 1                             \n\t"
      "  breq   wait_spm                           \n\t"
      "  ret                                       \n\t"
      "page_done:                                  \n\t"
      // r1 was used as scratch above, restore it to zero for the compiler
      "  clr    __zero_reg__                       \n\t"
      :
      : [spm_pagesize] "M" (SPM_PAGESIZE),
        [spm_reg] "I" (_SFR_IO_ADDR(__SPM_REG)),
        [spmen] "I" (SPMEN),
        [pgers] "I" (PGERS),
        [pgwrt] "I" (PGWRT),
        [page_hi] "r" (page_hi),
        [page_lo] "r" (page_lo)
      : "r0","r16","r17","r20","r28","r29","r30","r31");
    // Update addresses
    address = address + written;
    }
  return true;
  }
```

It's a fairly straightforward process. The downside of using a small data size is that each 64 byte flash page has to be erased and written four times (once per 16 byte packet) before it is full. Increasing the size of the data block in the protocol would help with this, and in future revisions I might do that to alleviate the issue.
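To put numbers on that trade-off, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and it assumes the image starts on a page boundary and that the packet size divides the page size evenly, so no packet straddles two pages.

```c
#include <stdint.h>

#define PAGE_SIZE 64u  // ATtiny85 flash page size in bytes

// Erase/write cycles a single page goes through while it is being filled,
// given that every packet triggers one read-modify-write of its page.
uint32_t cycles_per_page(uint32_t packet_bytes) {
  return PAGE_SIZE / packet_bytes;
}

// Total erase/write cycles needed to load an image of `image_bytes`
// (one cycle per packet, rounding up for a partial final packet).
uint32_t total_cycles(uint32_t image_bytes, uint32_t packet_bytes) {
  return (image_bytes + packet_bytes - 1u) / packet_bytes;
}
```

With 16 byte packets a 1 KB image takes 64 erase/write cycles; with 64 byte packets the same image would take 16, one per page.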

# Bootloader and Application Startup

 The ATmega series of chips have direct support for bootloaders - programming the appropriate fuse bits will cause execution to start at the bootloader start address instead of using the code at the RESET vector to start the application. Unfortunately this feature is not available on the ATtiny85 so we have to resort to a little bit of trickery to get the same result.

The bootloader is positioned in the upper 1K of memory, starting at address 0x1C00 (unless otherwise specified I will be using the byte address rather than the word address). When the ATtiny starts up it begins execution at 0x0000, which is the RESET vector. A program built for the chip places an '*rjmp*' (relative jump) instruction there to jump to the start address of the application. We want the CPU to go directly to the bootloader rather than the application, so we make a few changes to the contents of the data being loaded into the flash before actually transferring it.

The opcode for the '*rjmp*' instruction is 0xC*nnn*. For an instruction at address 0x0000, *nnn* is one less than the word address of the application entry point - a code of 0xC00F would indicate that the start of the application is at 0x0010 (word address, the byte address would be 0x0020). The '*mbflash.py*' utility I wrote handles this when the device type is an ATtiny85 - it looks for the signature of the '*rjmp*' instruction at address 0x0000 and, if found, replaces it with a different '*rjmp*' instruction that will jump to the bootloader program. The start address of the application program is stored at the top of flash memory in the two bytes just before the bootloader code itself. When it's time to launch the application the bootloader loads that information into the Z register and uses the '*ijmp*' instruction to jump to it.

```
addr_hi = pgm_read_byte_near(TOP_ADDRESS);
addr_lo = pgm_read_byte_near(TOP_ADDRESS - 1);
asm volatile(
  // Z points to the application start address
  "  mov   r30, %[addr_lo]                     \n\t"
  "  mov   r31, %[addr_hi]                     \n\t"
  "  ijmp                                      \n\t"
  :
  : [addr_hi] "r" (addr_hi),
    [addr_lo] "r" (addr_lo)
  : "r30","r31");
```

The flashing utility will refuse to transfer the code if it can't find the '*rjmp*' instruction or if the code to be loaded would overwrite the storage location of the actual application start address.
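The reset vector patching can be sketched in C as follows. This is not the code from mbflash.py (which is a Python utility). All function and constant names are illustrative, and the image is assumed to be a plain byte array with opcodes stored low byte first. The only values taken from the description above are the 0xC*nnn* encoding, the bootloader start at byte address 0x1C00 and the two storage bytes directly below it. The sketch also relies on the program counter wrapping around on a 4096 word device, which is what lets a 12 bit relative jump reach the bootloader from address 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_WORDS      0x1000u  // ATtiny85: 8 KB of flash = 4096 words
#define BOOTLOADER_WORD  0x0E00u  // byte address 0x1C00 / 2
#define ENTRY_STORE_BYTE 0x1BFEu  // first of the two bytes below the bootloader

// True if the opcode is an rjmp (top four bits are 1100).
bool is_rjmp(uint16_t opcode) { return (opcode & 0xF000u) == 0xC000u; }

// Word address reached by an rjmp located at word address `at`. The offset
// is relative to the following instruction; working modulo FLASH_WORDS
// handles negative offsets and wrap-around in one step.
uint16_t rjmp_target(uint16_t opcode, uint16_t at) {
  return (uint16_t)((at + 1u + (opcode & 0x0FFFu)) % FLASH_WORDS);
}

// Opcode for an rjmp located at word address `at` that lands on `target`.
uint16_t rjmp_encode(uint16_t target, uint16_t at) {
  return (uint16_t)(0xC000u | ((target - at - 1u) & 0x0FFFu));
}

// True if an image of `size_bytes` stays clear of the stored entry address.
bool image_fits(uint32_t size_bytes) { return size_bytes <= ENTRY_STORE_BYTE; }

// Redirect the reset vector of `image` to the bootloader and report the
// application's original entry point (word address) through `entry`.
// Returns false, leaving the image untouched, if it does not start with rjmp.
bool patch_reset_vector(uint8_t *image, uint16_t *entry) {
  uint16_t opcode = (uint16_t)(image[0] | (image[1] << 8));
  if (!is_rjmp(opcode))
    return false;
  *entry = rjmp_target(opcode, 0);
  uint16_t patched = rjmp_encode(BOOTLOADER_WORD, 0);
  image[0] = (uint8_t)(patched & 0xFFu);
  image[1] = (uint8_t)(patched >> 8);
  return true;
}
```

For an image that starts with 0xC00F this reports an entry point of 0x0010 and rewrites the first opcode to 0xCDFF; the reported entry point is the value that would then be stored in the two bytes below the bootloader.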

# Next Steps


The [next post](/attiny85-breakout-board/) will detail the small breakout board I made for the ATtiny that works well with this bootloader for very rapid development and testing. I've made a few minor changes to the design based on how I found myself using it - once I finish those off I'll document the whole thing.

 I'm hoping the bootloader will get some use externally (even if only as sample code for writing your own) - if there are any questions about it please go ahead and bring them up in the comments. If you find any bugs or have feature requests just raise them on the [GitHub project page](https://github.com/thegaragelab/microboot/issues).