TWI module seems buggy in multi-master communications

I have a project with two ATmega48s, both configured as TWI master and slave. I noticed that after a couple of hours the TWI modules stop working, and I have to reset the AVRs to resume operation. I stripped my application down to the bare minimum to test this issue and increased the communication rate on the bus. I can now crash the modules in seconds. Both AVRs run the same program, except that the slave and target addresses are swapped. There is a random delay before a start condition is issued, so that both have an equal chance of becoming master.

After analysis, it seems to me that there is a bug in the TWI hardware module when both masters are configured simultaneously to send a start (TWSTA set in TWCR).

Here are some screenshots of the signals:

D0: I2C data (SDA).
D1: I2C clock (SCL).
D2: signal of the first AVR, set when TWSTA is set and cleared at the first interrupt.
D3: signal of the first AVR, set at the beginning of the interrupt and cleared at the end.
D4: signal of the first AVR, set when TWSTA is set in master mode or when the first interrupt occurs in slave mode.
D5: signal of the second AVR, set when TWSTA is set and cleared at the first interrupt.
D6: signal of the second AVR, set at the beginning of the interrupt and cleared at the end.
D7: signal of the second AVR, set when TWSTA is set in master mode or when the first interrupt occurs in slave mode.

Figure 1 shows the I2C bus hanging after a few seconds of correct communication. During this period, the two AVRs take turns being master and slave. Most of the time both set TWSTA at around the same time, but because one is slightly ahead of the other, one becomes master and the other slave.

Figure 2 shows such a communication, where the first AVR sets TWSTA 7.60µs before the second. The first pulse of D3 is the interrupt of the first AVR, in which it writes the address of the target. Because the bus was busy when TWSTA was written in the second AVR, that AVR could not issue a start condition, so D5 is only cleared at its first interrupt, which occurs when its slave address has been recognized. Then 4 bytes of data are exchanged, and the last pulse of D6 is the interrupt indicating that a STOP condition was detected. Figure 3 is a zoom on the start bit. We can see the 7.60µs delay between the two moments at which TWSTA is set.
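For readers who want to reproduce the traces, the instrumentation behind D2/D5 and D3/D6 could look roughly like the sketch below. This is not the original test program: the debug pin numbers, the use of PORTB, and the function names `request_start` and `twi_isr_body` are assumptions made for illustration. The host stand-ins in the `#else` branch only exist so the fragment compiles off-target. A real interrupt handler would also have to service TWSR and rewrite TWCR, and the debug pins would have to be configured as outputs.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
#else
/* Host stand-ins so the sketch compiles off-target.
 * Bit positions follow the ATmega48 TWCR register layout. */
static volatile uint8_t TWCR, PORTB;
#define TWINT 7
#define TWEA  6
#define TWSTA 5
#define TWEN  2
#define TWIE  0
#endif

/* Hypothetical debug pins (assumption, not taken from the note). */
#define DBG_START_PIN 0 /* plays the role of D2 (or D5 on the other AVR) */
#define DBG_ISR_PIN   1 /* plays the role of D3 (or D6 on the other AVR) */

/* Raise the "TWSTA set" debug pin, then ask the TWI module for a START. */
static void request_start(void)
{
    PORTB |= (uint8_t)(1u << DBG_START_PIN);
    TWCR = (uint8_t)((1u << TWINT) | (1u << TWEA) | (1u << TWSTA) |
                     (1u << TWEN) | (1u << TWIE));
}

/* Body of the TWI interrupt: the ISR pin is high for its whole duration,
 * and the "TWSTA set" pin is cleared at the first interrupt. */
static void twi_isr_body(void)
{
    PORTB |= (uint8_t)(1u << DBG_ISR_PIN);
    PORTB &= (uint8_t)~(1u << DBG_START_PIN);
    /* The TWSR state machine and the TWCR rewrite would go here (omitted). */
    PORTB &= (uint8_t)~(1u << DBG_ISR_PIN);
}

#ifdef __AVR__
ISR(TWI_vect)
{
    twi_isr_body();
}
#endif
```

With this arrangement, the width of the pulse on the "TWSTA set" pin is the time between requesting a START and the first TWI interrupt, which is the quantity the D2 and D5 traces show.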
Figure 1. The TWI module stopped working after a few seconds of successful communication (the trace is annotated "TWI hangs here").

Figure 2. A correct communication: both masters set TWSTA, but the quicker one becomes master (first AVR) and the slower one becomes slave (second AVR).
Figure 3. Delay between the two moments at which TWSTA is set.

Figure 4 shows the last communication that happened before the bus hung. Notice the two glitches after the stop condition. I get those glitches every time the bus hangs, and only just before it hangs, never in any other data transmission. From Figure 5 we can see that this time the second AVR was 2.0µs earlier than the first AVR. The transmission continues correctly apart from the glitches. From Figure 1 we can see that after that last transmission the bus lines seem to be released and both AVRs set their TWSTA again, but nothing happens after that: no start condition is issued. I guess that the two glitches somehow crash the hardware TWI module, but I have no idea why they happen. From repeated measurements, I could determine that the condition for getting those glitches is that both TWSTA bits are set within approximately 5µs of each other.

I also monitored TWSR during all transmissions and everything is normal. The last statuses I got on the slave AVR are 0x60 (slave detected), 0x80 (data received) four times, and 0xA0 (stop detected).
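The TWSR check described above can be sketched as a small helper that masks off the prescaler bits and verifies that a logged sequence is the expected slave-receiver pattern: 0x60, then one 0x80 per data byte, then 0xA0. The function names and the idea of logging the statuses into an array are assumptions for illustration, not the original monitoring code.

```c
#include <stddef.h>
#include <stdint.h>

/* The upper five bits of TWSR hold the status; the low bits are the
 * prescaler setting and a reserved bit, so they are masked off. */
#define TWI_STATUS_MASK 0xF8u

/* Describe the three slave-receiver statuses seen in the traces. */
static const char *twi_slave_rx_status_name(uint8_t twsr)
{
    switch (twsr & TWI_STATUS_MASK) {
    case 0x60: return "own SLA+W received, ACK returned";
    case 0x80: return "data received, ACK returned";
    case 0xA0: return "STOP or repeated START received";
    default:   return "other";
    }
}

/* Return 1 if the log is exactly: 0x60, n_data times 0x80, 0xA0. */
static int is_clean_slave_receive(const uint8_t *log, size_t len, size_t n_data)
{
    size_t i;

    if (len != n_data + 2)
        return 0;
    if ((log[0] & TWI_STATUS_MASK) != 0x60)
        return 0;
    for (i = 1; i <= n_data; i++) {
        if ((log[i] & TWI_STATUS_MASK) != 0x80)
            return 0;
    }
    return (log[len - 1] & TWI_STATUS_MASK) == 0xA0;
}
```

Applied to the last transfer before the hang (0x60, four times 0x80, 0xA0), such a check passes, which is consistent with the observation that the status register gives no hint of the problem.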
Figure 4. Last transmission: there are glitches after the stop condition, and from that point the bus freezes.

Figure 5. Zoom of the start condition: both TWSTA bits are set within 2.0µs; the second AVR is quicker.
Figure 6. Zoom of the stop condition and the glitches.

Contact: David Bourgeois david-at-jaguarondi.com