Full-length report about the level shifter problem |
What follows is the detailed background report about the killjoy of this summer...
It all began with the fact that most of the finished boards wouldn't even start after they had arrived at Medusa. The few that did consumed so much power that they caused the generously sized power supplies to fail. We don't want to bore you with the details of how we located the problem and debugged the hardware, so we will skip straight to the description of the actual problem. After many tests it appeared (and was later confirmed) that the issue was with the level shifters of the FireBee. These 6 level shifters per board (positions U25 and U31 to U35) are small chips that convert signals between the different voltages needed by the various ICs, e.g. 3.3 Volt, 1.45 Volt, 1.8 Volt etc. They are bidirectional 8-channel voltage converters from Texas Instruments, a "standard" model: 74LVC(8T)245. They are needed because back in the Atari days 5V was the standard voltage in computers, but the more modern FireBee electronics use 3.3V (or, as mentioned, even lower voltages). The package we use is a "QFN" (Quad Flat No-leads, a four-sided package without legs) with an exposed pad, i.e. a metal surface on the underside. This was supposed to be used as a "Thermal Pad", in order to improve the cooling of the ICs through the circuit board.
These so-called "Thermal Pads" therefore have to be soldered onto the board completely flat in order to properly dissipate the heat generated when converting the voltage. The schematic of this IC (also called a "level converter") from Texas Instruments does not show any electrical connection between the "exposed thermal pad" on the underside and any other structure on, or inside, the chip. The specification only states that the thermal pad "must" be soldered on. For this reason, the FireBee was designed to use the 3.3V part of the board for the cooling. This was an unconventional and really pretty innovative idea that would help save resources and space on the circuit board. The prototypes also confirmed that this worked fine. The "conservative" approach would have been to solder the "thermal pads" of these six voltage converters to GND.
What happened with the second batch is that the resistance of the level converters started to change during operation. When new, the resistance between the "exposed thermal pad" and GND is infinite. When the parts were desoldered and brand-new ones were bought by MCS and used for verification, the resistance was a few kilo-ohms while running at 25°C. When the temperature rose to 100°C, it fell to a few hundred ohms. Sometimes the bus drivers heated up to over 110°C... kind of a vicious circle, really: directly after switching on the computer, the board started to change. The warmer it got, the lower the resistance became, the more current would flow, and the more it would heat up.
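To put rough numbers on this feedback loop, here is a minimal sketch that applies P = V²/R to the leakage path between the thermal pad (on 3.3V) and GND. The report only says "a few kilo-ohms" and "a few hundred ohms", so the values 3000 and 300 ohms below are illustrative assumptions, not measurements, and the function name is made up for this example. The sketch covers only the current through the pad and says nothing about other currents inside the chip.

```python
SUPPLY_V = 3.3  # the thermal pads sit on the 3.3 V side of the board (from the report)


def leak_power_mw(resistance_ohm, voltage=SUPPLY_V):
    """Power dissipated in the pad-to-GND leakage path, in milliwatts (P = V^2 / R)."""
    return voltage ** 2 / resistance_ohm * 1000.0


# Illustrative resistances only: the report gives orders of magnitude, not exact values.
assumed_resistance_ohm = {"25 degC": 3000.0, "100 degC": 300.0}

for label, r in assumed_resistance_ohm.items():
    print(f"{label}: {r:.0f} ohm -> {leak_power_mw(r):.1f} mW per chip")
```

Under these assumptions, a tenfold drop in resistance means a tenfold rise in the power burned in the leakage path of each of the six chips, which is the direction of the vicious circle described above.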
So without any external influence, the behavior of the computers changed under load. You can probably imagine what that meant for the first few weeks of debugging the hardware: a system with almost 1000 parts on an 8-layer board, which had been proven to work fine and on which nothing had been changed, suddenly started to do what it wanted and behaved differently every few seconds. In retrospect, that was a great episode from the E-tech horror shop... but in July, without any idea what the problem was, it was not funny at all!
Our final assumption is that TI changed the "die-bond material" (the glue in the chip) in their IC production, which is used to connect the IC structure with the thermal pad internally. Some of these glues can become conductive when heated. Unfortunately, we could only get a brief statement from TI saying that the thermal pad was supposedly still "not connected internally".
The problem clearly lies with TI. The company changed the physical properties of the ICs somehow, at some point in time, but has not documented these changes anywhere. To this day, and despite intensive searches for PCNs (Product Change Notifications) on various channels, we have been unable to find any information about it. So the current chip is, officially, exactly the same as in the previous batches - but that is objectively not true. We were also unable to find out when exactly the changes occurred - with which new chip series, or in which production period.
A hardware developer who joined the discussion during the analysis and offered his help put it this way: "MCS has worked to the best of their knowledge and should IMHO not be the ones paying up for the costs". All of this also explains why the assembly company is not to blame for this whole misery either. Despite our announcement that we would fix the problem free of charge, costs of almost 3000 Euro have accumulated. The reason is that the assembly company now has to remove the six bus drivers, stick some insulating but heat-dissipating Kapton tape onto the underside of the thermal pads, and then solder them back on - all of this for a 3-digit number of boards. And all of this despite the fact that they did nothing wrong and caused no production problems at all. Anyone who is familiar with motherboard production knows that this custom fixing process, for such a large number of boards, and at that price, is absolutely amazing! Our assembly company has also been very forthcoming and supportive during the search for the cause of the problems, and when communicating with the vendors and manufacturers. A great service, and an exemplary collaboration!
We hope that this explains why we are asking you for donations. We, as a project, don't want one single person, who has already invested a lot of their spare time to make the FireBees available at the production cost of the hardware, to now have to pay all the costs. On the other hand, we also don't want to raise the price of the boards, since you pre-ordered them at a specific price. And the assembly company is not at fault in any of this, as described before. Finally, any attempt to get money from a large semiconductor manufacturer for problems caused by their faulty documentation - even though the first batch and the prototypes behaved completely differently - is obviously futile. So, for all these reasons, we would again kindly ask you to show solidarity and foot the bill together.
Thanks also again for all the trust you have put in Medusa, and the support you gave us during the delivery delays and problem analysis!
Donations please as described either by PayPal to
mcs (at) kingx (dot) com
or by wire transfer to:
Bank account Nr.: 202-805498.40F
IBAN: CH22 0020 2202 8054 9840 F
"reason for payment": FB series 2 donation
||JoeIron :: 2016-12-09 07:35:38|
Forgot to mention "donation" in PayPal transaction....
||JoeIron :: 2016-12-09 12:38:11|
A friend of mine has just drawn my attention to the fact that the documentation of the same chip produced by NXP contains the following information about the thermal pad: "This is not a supply pin, the substrate is attached to this
pad using conductive die attach material. There is no
electrical or mechanical requirement to solder this pad
however if it is soldered the solder land should remain
floating or be connected to GND"
||Mathias :: 2016-12-09 14:45:52|
Hey JoeIron, that's right for the NXP chips. But the docs for our TI ones only say "must be soldered" and "not connected", … and finally it worked for the prototypes and the first series, that's the crazy situation.
Thanks for the donation, and no problem about the missing mention.
||JoeIron :: 2016-12-09 15:12:13|
I hope this error will not manifest itself in the first batch as time goes on...
||Xata :: 2017-09-14 21:03:11|
Sometimes you can make vias to another side from under the thermal pad and create a cooling plane there. But that is moving everything around, new gerbers, new pcbs and testing