In this 10-minute video, one of the authors summarizes the issues with silent silicon data corruption presented in their »Cores that don't count« paper.
Version 2.0 of the NVM Express specification has been released.
Paper by Google on mercurial processor cores that cause computational errors that were not detected during manufacturing tests.
The announced 2nm development will deliver the same performance with 75% of the energy, compared to modern 7nm processors.
»In 1990, Europe accounted for about 44% of global semiconductor manufacturing. Now, it's closer to 10% and Taiwan, South Korea and Japan account for about 60% of production…«
YouTube will gain up to 33 times the performance with its custom-built video transcoding units compared to optimized software on traditional servers.
Article about Microsoft's first production-environment deployment of two-phase liquid immersion cooling in a data center.
Article on how CPUs are locked to a vendor ecosystem when AMD PSB is enabled.
In this blog post, the author shows some ways in which money is wasted on IT infrastructure.
Today’s software systems are arguably robust at logging and recovering from fail-stop hardware – there is a clear, binary signal that is fairly easy to recognize and interpret. We believe fail-slow hardware is a fundamentally harder problem to solve. It is very hard to distinguish such cases from ones that are caused by software performance issues. It is also evident that many modern, advanced deployed systems do not anticipate this failure mode. We hope that our study can influence vendors, operators, and systems designers to treat fail-slow hardware as a separate class of failures and start addressing them more robustly in future systems.
Exciting times are ahead for us. We expect that our Zion, Kings Canyon, and Mount Shasta designs will address our growing workloads in AI training, AI inference, and video transcoding respectively.