High Performance Computing
![Cray YMP [ Cray YMP ]](/avg/pic/Cray_YMP.jpg)
Linux PCs
![[ Linux Workstation ]](/avg/img/HP_XW9400.png)
In high performance computing, hardware typically only has a life span of three years. Then performance demands are such that it must be replaced with something more powerful. The specialized UNIX™ engineering workstations have never found a use outside of engineering, so they had to be written off. The Linux workstations can live on a few more years, after their high performance computing duties are over, in less demanding positions, elsewhere in the company, where they may even run Microsoft Windows™. This allows CAE engineers and accountants to become friends 😉.
Cluster Computing
Common off-the-shelf PC workstations can be combined into a high performance analysis cluster.
Each node may have multiple processors and each processor may have multiple cores.
![Computer Cluster [ Computer Cluster ]](/avg/img/ComputerCluster.png)
I built my first cluster with 4 dual 2.2 GHz Xeon HP systems and a gigabit switch. The second one comprised 8 IBM systems with dual 2.4 GHz AMD Opteron processors and an Infiniband™ switch. The latest one has 16 nodes and they all have two multi-core CPUs.
Job scheduling applications allow the 8-node cluster to be sub-divided into partitions of 4, 8, 16, 32 or
even 64 CPU cores. This gives the flexibility to either run a single big job through as fast as
possible or run multiple jobs in parallel. Efficiency diminishes as more nodes are added,
so subdividing the cluster may be the most effective way to run multiple jobs in a
shared-resource environment. The performance of the fastest compute clusters is measured with a
benchmark and posted on the Top500 web site for bragging rights.
CPU Configuration*) | Speed-up Factor |
---|---|
1 x 1 x 1 = 1 | 1.0 |
1 x 2 x 1 = 2 | 1.5 |
1 x 2 x 2 = 4 | 2.5 |
4 x 1 x 1 = 4 | 3.7 |
4 x 2 x 1 = 8 | 6.0 |
8 x 2 x 1 = 16 | 9.0 |
8 x 2 x 2 = 32 | 12.2 |

*) 4 x 2 x 1 denotes 4 nodes, each with 2 CPUs, each with 1 core.
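
The diminishing efficiency can be read straight from the table by dividing each speed-up factor by the number of cores used. The sketch below does that and also estimates what happens when the cluster is split into independent partitions. The function names are made up for illustration, and the partition estimate assumes that jobs in separate partitions do not slow each other down, which is an idealisation.

```python
# Speed-up factors copied from the table above, keyed by
# (nodes, CPUs per node, cores per CPU).
SPEEDUP = {
    (1, 1, 1): 1.0,
    (1, 2, 1): 1.5,
    (1, 2, 2): 2.5,
    (4, 1, 1): 3.7,
    (4, 2, 1): 6.0,
    (8, 2, 1): 9.0,
    (8, 2, 2): 12.2,
}


def total_cores(config):
    """Number of cores in a nodes x CPUs x cores configuration."""
    nodes, cpus, cores = config
    return nodes * cpus * cores


def efficiency(config):
    """Parallel efficiency: speed-up divided by the number of cores used."""
    return SPEEDUP[config] / total_cores(config)


def partitioned_throughput(whole, part):
    """Combined speed-up when the cores of `whole` are split into
    independent partitions shaped like `part`, each running its own job.
    Assumes the partitions do not interfere with each other."""
    partitions = total_cores(whole) // total_cores(part)
    return partitions * SPEEDUP[part]


for config in SPEEDUP:
    print(config, total_cores(config), f"{efficiency(config):.0%}")
```

With these numbers the full 8 x 2 x 2 configuration runs at about 38% efficiency. Under the no-interference assumption, eight 1 x 2 x 2 partitions would deliver a combined speed-up of 20, against 12.2 for a single job spread over all 32 cores.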
An analysis cluster is typically kept in an air-conditioned server room, which dissipates
the heat generated, together with other dedicated computers such as the application, web,
and database servers. The system administrator can access the computers there
or, with the right equipment, from halfway across the world. My setup, with a
networked KVM switch, is such that an analysis cluster in Australia can be administered
from Michigan and monitored through a web browser. The status web page gives a quick
overview of the systems, with colors indicating the status of the servers:
- Black for systems that are offline;
- Blue for systems that are available but idle;
- Green for systems that are in use but have reserve capacity (typically running at about 50%);
- Orange for systems that are running at full capacity, using all CPUs and cores;
- Red for systems that are overloaded; if that becomes structural, it indicates a capacity problem.
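
The color scheme above can be sketched as a small function. This is only an illustration, not the code behind the status page: the thresholds, the function name, and the server names are all assumptions, and `load` is taken to be the demand on a server as a fraction of its capacity (for example the load average divided by the number of cores).

```python
# Hypothetical thresholds; the page only says that green systems typically
# run at about 50% and that orange means all CPUs and cores are in use.
IDLE_BELOW = 0.05      # assumed: less than 5% busy counts as idle
FULL_FROM = 0.95       # assumed: 95% busy or more counts as full capacity
OVERLOAD_ABOVE = 1.05  # assumed: more runnable work than the cores can serve


def status_color(online: bool, load: float) -> str:
    """Map a server state to one of the five status colors."""
    if not online:
        return "black"   # offline
    if load < IDLE_BELOW:
        return "blue"    # available but idle
    if load < FULL_FROM:
        return "green"   # in use, with reserve capacity
    if load <= OVERLOAD_ABOVE:
        return "orange"  # running at full capacity
    return "red"         # overloaded


# Made-up server names and loads, purely for demonstration.
for name, online, load in [("node01", True, 0.5), ("node02", True, 1.4), ("web", False, 0.0)]:
    print(name, status_color(online, load))
```

A structural capacity problem would show up as a server that stays red over many consecutive samples, so a real monitor would look at the history rather than a single reading.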
One click on the monitor icon gives me detailed information about the system.
Server | CPUs | Server | CPUs | Server | CPUs |
---|---|---|---|---|---|
Cluster/DMP server | 2x Xeon E5 2.6 GHz | Cluster/DMP server | 2x Xeon E5 3.5 GHz | Web server | 2x Opteron 3.2 GHz |
Cluster/DMP server | 2x Xeon E5 2.6 GHz | Cluster/DMP server | 2x Xeon E5 3.5 GHz | Database server | 2x Opteron 3.2 GHz |
Cluster/DMP server | 2x Xeon E5 2.6 GHz | Cluster/DMP server | 2x Xeon E5 3.5 GHz | Application server | 2x Opteron 2.8 GHz |
Cluster/DMP server | 2x Xeon E5 2.6 GHz | Cluster/DMP server | 2x Xeon E5 3.5 GHz | Backup server | 2x Opteron 2.6 GHz |

Likewise the user workstations can be anywhere within the local area network or, like mine,
in the wide area network. Their performance is of less importance, although good graphics
performance helps. My systems have quad-, hexa-, or octa-core Intel or AMD CPUs running at a variety
of clock speeds from 2.6 GHz to 3.5 GHz, and all sport NVIDIA graphics.
Just because it seems that every workstation these days is based on the x86_64 architecture and the Linux operating system doesn't mean that I haven't enjoyed the previous generation of UNIX™ based computer systems with RISC* architecture CPUs or the exotics from Cray, Alliant, Gould, and others before that. I did. From TNO in Delft, Netherlands we connected to the Cray-YMP of the SARA Center of the University of Amsterdam. One late night in 1990 I was connected from my home, through my Acorn R260 and 1200 baud modem, to TNO and via the "Internet" to the YMP. I typed in "who" to see who else was working on the computer that night. There was no-one. I had the Cray all to myself... That made the hairs on my neck stand up! 😎
*) RISC stands for Reduced Instruction Set Computer
All throughout the '90s I loved the Silicon Graphics workstations with their MIPS CPUs and performance graphics. I even had some myself.
Silicon Graphics Indigo R4000/Elan
Having enjoyed Silicon Graphics computers professionally from the first Iris 4D20 in 1989,
it was only a matter of time for one to occupy space in my own office.
The supply and demand lines finally crossed each other in January 1997.
My primary use of the Indigo was in software development.
However, it was also used for Internet access and DNS.
This "Purple Beauty" looked into the world through the desirable
20” Sony Trinitron, making the most of the ELAN graphics with 24 bitplanes
and hardware Z-buffer. It ran Irix 6.2 and boasted 64 MB RAM and 5 GB disc space.
Silicon Graphics Indy R5000
![Half Blue [ Half Blue ]](/avg/img/Indy.jpg)
Dell Dimension XPS T500
The Accountant wanted to run "Peachtree" and the Program Manager needed
access to industry standard presentation and program management software for compatibility
with the rest of the business world. So on the brink of the new millennium an
Intel/Microsoft Windows 98™ box was acquired.
For several years it ran the first line office duties. It was even upgraded to Windows XP™. Then it served as my test bed for various flavors of Linux and for internet access. With its 500 MHz Intel Pentium III processor it was just about powerful enough to do that. It had 640 MB of RAM, a 13.5 GB hard disk, a DVD, CDRW, a camera and microphone. It has long since been replaced by a sleek little laptop that can do all that so much better, and stream live TV on top of that.
Acorn Archimedes R260
![Acorn R260 [ Acorn R260 ]](/avg/img/R260.gif)
The R260 faced the world through a 15” Sony Trinitron monitor. It had 8 MB RAM and 600 MB disc space. 10 years is a long timespan for a computer, which shows how far ahead the system was at release. Today the ARM architecture that was the heart of the Acorn R260 lives on (for instance in the "Snapdragon") in many PDAs, cell phones and calculators.
Acorn BBC model B
Ah, the venerable BBC-B computer. The one that taught me the love for computers, software,
and everything associated with it. With analog-to-digital converters and serial, parallel, and
8-bit I/O ports, it had all the interfacing capabilities for exciting hardware projects.
With its 8-bit 2 MHz MOS 6502 processor, linear memory, assembler, BASIC, Pascal, Forth, BCPL,
sound and graphics it had everything you needed to understand and enjoy computing and
learn how to program. It came with good games too: the most famous one was Elite,
but there were many others that offered stress relief during software bug hunts.
The original system came with just 32 kB RAM. I had dual 320 kB floppy drives and
(eventually) a 10 MB hard drive. It came with an expansion bus that allowed a co-processor
(with memory) to be attached to the base unit. These co-processors (I had the Intel 80186
as well as the NS 32016) were much more powerful and extended the life of the BBC-B all the way
into the nineties. There are still enthusiasts out there that keep them alive after some
forty years. Rightfully so! I pity the kid who has to get the love of computing from a
modern-day PC or tablet. (Fortunately, there is the Raspberry Pi.)
Casio FX-700P
The Casio FX-700P was my first computer, being that it was programmable in BASIC. I have had it
since 1983. It once boasted a sixth order (+ square root) polynomial RMS approximation program
that could calculate a NACA wing profile.
I remember it fondly for three reasons.
![Casio FX-700P [ Casio FX-700P ]](/avg/img/Casio.jpg)
- It was still holding together despite numerous dings and despite losing all original screws.
- I have never found a worthy successor.
- It had the sympathetic message "READY P0" written on the display.
If all else fails: The Slide Rule!
No computer will ever enslave me!
![[ slide ruler ]](/avg/img/slideruler.jpg)
![[ Roark's ]](/avg/img/Roark-7th.jpg)
The table below lists some other computers I’ve met or owned and their relative CPU performance. They are all indexed against the DEC MicroVAX II using Digital Research Labs' benchmarking routines. For multi-processor systems the single CPU performance is listed, unless otherwise noted.
System Architecture and CPU | Operating System | MVUP¹ | Year² |
---|---|---|---|
DEC MicroVAX II with KA630-AA (78032/78132) @ 5 MHz | VMS | 1.0 | 1986 |
Acorn R260 ARM3/FPA10 @ 26 MHz | RISCiX 1.21c | 4.6 | 1989 |
SGI 4D20 Mips R2000A/R2010A @ 12 MHz (IP6) | IRIX 4.0.5 | 8.9 | 1989 |
SGI 4D25 Mips R2000A/R2010A @ 20 MHz (IP6) | IRIX 4.0.5 | 15.8 | 1989 |
Cray Y-MP | Unicos 7.0.5 | 194.1 | 1989 |
SGI 4D440 Mips R3000/R3010 @ 40 MHz (4 CPUs) (IP7) | IRIX 4.0.5 | 37.0 | 1991 |
SGI 4D35 Mips R3000/R3000 @ 36 MHz (IP12) | IRIX 4.0.5 | 31.7 | 1991 |
SGI Indigo Mips R3000/R3000 @ 33 MHz (IP12) | IRIX 4.0.5 | 28.2 | 1992 |
IBM RS6000 / 34H POWER Arch. @ 42 MHz | AIX 3.2.5 | 65.1 | 1993 |
Compaq / Intel 80386 DX 33 MHz, IIT 80C387 | Linux 0.99.11 | 2.6 | 1993 |
Cray C98/4256 | Unicos 7.C.3 | 243.4 | 1993 |
Sun 4c SPARC cpu + TI fpu @ 40 MHz | SunOS 4.1.1 | 23.0 | 1993 |
SGI Indigo XZ Mips R4000/R4010 @ 100 MHz (IP20) | IRIX 6.2 | 62.0 | 1993 |
SGI Indigo Extreme Mips R4400/R4010 @ 150 MHz (IP22) | IRIX 5.1.1.3 | 97.2 | 1994 |
DEC 3000_500 Alpha @ 100 MHz | OSF/1 1.2.10 | 97.2 | 1994 |
HP PA 9000/715 @ 50 MHz (PA-RISC 1.1) | HPUX A.09.0 | 69.8 | 1994 |
SGI Indy Mips R4600/R4610 @ 100 MHz (IP22) | IRIX 5.2 | 59.3 | 1994 |
SGI Indigo2 XZ Mips R8000/R8010 @ 75 MHz (IP26) | IRIX64 6.0.1 | 125.2 | 1994 |
Cray C90 | Unicos 8.0.3 | 341.9 | 1994 |
Cray J90 (4 CPUs) | Unicos 8.0.3 | 263.9 | 1995 |
SGI Indy Mips R5000/R5000 @ 180 MHz (IP22) | IRIX 6.2 | 204.1 | 1996 |
SGI Indigo3 Mips R10000 @ 195MHz (IP28) | IRIX 6.2 | 624.5 | 1996 |
HP C200 PA 9000/782 @ 200 MHz (PA-RISC 1.1) | HPUX B.10.20 | 269.8 | 1998 |
HP C240 PA 9000/800 @ 240 MHz (PA-RISC 1.1) (4 CPUs) | HPUX B.11.00 | 317.7 | 1999 |
Sun Ultra-Enterprise 500/6500 (4 CPUs) | SunOS 5.6 | 238.4 | 1999 |
Dell XPS T500 Intel Pentium III @ 500 MHz | Linux 2.2.14 | 412.9 | 2000 |
Compaq AP500 Intel Pentium III @ 550 MHz | Linux 2.2.14 | 455.2 | 2000 |
HP C550 PA 9000/785 @ 550 MHz (PA-RISC 2.0) | HPUX B.11.00 | 1304 | 2001 |
Compaq EVO W6000 Pentium III Xeon @ 2.2 GHz (2 CPUs) | Linux 2.4.7 | 1398 | 2002 |
Compaq EVO W6000 Pentium III Xeon @ 2.4 GHz (2 CPUs) | Linux 2.4.18 | 1456 | 2002 |
HP xw6000 Pentium III Xeon @ 2.8 GHz (2 CPUs) | Linux 2.4.18 | 1870 | 2003 |
Dell Inspiron 5150 Pentium 4HT @ 3.06 GHz | Linux 2.4.21-4 | 2370 | 2004 |
HP xw6000 Intel Xeon @ 3.2 GHz (2 CPUs) | Linux 2.4.21-9 | 2578 | 2004 |
IBM Intellistation AMD Opteron 250 @ 2.4 GHz (2 CPUs) | Linux 2.4.21-9 | 3996 | 2005 |
4 x 1 x 1 cluster AMD Opteron 250 @ 2.4 GHz (Infiniband)³ | Linux 2.6.9-45 | 14800 | 2006 |
4 x 2 x 1 cluster AMD Opteron 250 @ 2.4 GHz (Infiniband)³ | Linux 2.6.9-45 | 24000 | 2007 |
8 x 2 x 1 cluster AMD Opteron 250 @ 2.4 GHz (Infiniband)³ | Linux 2.6.9-45 | 36000 | 2007 |
8 x 2 x 2 cluster Intel Xeon 5160 @ 3.0 GHz (Infiniband)³ | Linux 2.6.18-53 | 49000 | 2008 |
HP xw9400 AMD Opteron 2380 @ 2.5 GHz (1 x 2 x 4)³ | Linux 2.6.38-63 | 18641 | 2011 |
HP z800 Intel Xeon X5570 @ 2.93 GHz (1 x 2 x 4)³ | Linux 2.6.38-63 | 29253 | 2014 |
HP z800 Intel Xeon X5650 @ 2.67 GHz (1 x 2 x 6)³ | Linux 2.6.38-63 | 35024 | 2017 |
HP z800 Intel Xeon X5687 @ 3.60 GHz (1 x 2 x 4)³ | Linux 2.6.38-63 | 43880 | 2019 |
HP z840 Intel Xeon E5-2687W v3 @ 3.10 GHz (1 x 2 x 10)³ | Linux 5.6.13-100 | 54640 | 2022 |
HP z840 Intel Xeon E5-2690 v4 @ 2.60 GHz (1 x 2 x 14)³ | Linux 6.8.9-100 | 73906 | 2024 |

1. MVUP stands for MicroVAX Units of Processing.
2. The year listed signifies the year the system was benchmarked rather than the year it was first released.
3. The latest computers and clusters are too fast for the DRL benchmark to be used any longer. We have switched to using explicit finite element solver based benchmarks and scaled the numbers back to MVUPs.