Huawei
Huawei is a world leader in ICT

Comparing Huawei ExaGear to Apple's Rosetta 2 and Microsoft's solution

Original author: Armmaster

November 10, 2020 was in many ways a landmark date for the microprocessor industry: Apple unveiled its new Mac Mini, whose main feature was the M1 chip, developed in-house. It is no exaggeration to say that this processor is a landmark achievement for the ARM ecosystem: finally, there is an ARM chip whose performance surpasses that of x86 chips from competitors such as Intel, in a niche that x86 had dominated for decades.

For us, however, the main interest is not the M1 processor itself but the Rosetta 2 binary translation technology, which allows the user to run legacy x86 software that has not been migrated to the ARM architecture. Apple has a lot of experience in developing binary translation solutions and is a recognized leader in this area. The first version of the Rosetta binary translator appeared in 2006, where it aided Apple in the transition from PowerPC to the x86 architecture. Although the platforms involved this time were different from those of 2006, it was obvious that the experience Apple engineers had accumulated over the years was not lost, but was used to develop the next version, Rosetta 2.

We were keen to compare this new solution from Apple with Huawei ExaGear, a similar product developed by our team (with its lineage from Eltechs ExaGear). At the same time, we evaluated the performance of the x86-to-Arm binary translation provided by Microsoft (part of MS Windows 10 for Arm devices) on the Huawei MateBook E laptop. At present, these are the only other x86-to-Arm binary translation solutions on the open market that we are aware of.

Since the solutions were originally created for different operating systems (Huawei ExaGear for Linux, Apple Rosetta 2 for macOS, and the Microsoft binary translator for MS Windows), it was impossible to run them under the same conditions, so we had to find an appropriate comparison method. We chose a “translation efficiency” metric: the ratio between the performance of the guest-architecture build of a benchmark executed through the binary translator and the performance of the native build of the same benchmark. For our tests the native (target) platform is Arm and the guest platform is x86. In other words, we measured what percentage of the native result the benchmark achieves in binary translation mode. For Geekbench, where a higher score is better, this ratio is the translated score divided by the native score. For Spec, where a lower execution time is better, this ratio is the native execution time divided by the translated execution time. The same metric was used by the experts at AnandTech, who published a review of the Apple M1 that also included performance figures for Rosetta 2.
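The two forms of the metric can be written down in a few lines. The following is a minimal Python sketch, not the tooling used for the measurements; the function names are hypothetical, and the sample numbers are the Rosetta 2 figures for AES-XTS (Geekbench) and 400.perlbench (SpecCPU2006) from the tables in this article.

```python
def efficiency_from_scores(native_score, translated_score):
    """Translation efficiency for score-based benchmarks (higher is better)."""
    if native_score <= 0 or translated_score <= 0:
        raise ValueError("scores must be positive")
    return translated_score / native_score


def efficiency_from_times(native_seconds, translated_seconds):
    """Translation efficiency for time-based benchmarks (lower is better)."""
    if native_seconds <= 0 or translated_seconds <= 0:
        raise ValueError("times must be positive")
    return native_seconds / translated_seconds


# Geekbench AES-XTS under Rosetta 2: native 2769 points, translated 1720 points.
geekbench_aes = efficiency_from_scores(2769, 1720)

# SpecCPU2006 400.perlbench under Rosetta 2: native 157 s, translated 218 s.
spec_perlbench = efficiency_from_times(157, 218)

print(f"AES-XTS efficiency:       {geekbench_aes:.1%}")
print(f"400.perlbench efficiency: {spec_perlbench:.1%}")
```

Both forms give a value where 100% means "as fast as native", which is what makes scores and execution times comparable in the tables that follow.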

All measurements are single-threaded, because we are primarily interested in the performance of the translated code itself, that is, in what percentage of performance is lost compared to native code.

Huawei ExaGear vs Apple Rosetta 2

So, let us start by comparing Huawei ExaGear with Apple Rosetta 2 on an Apple MacBook Pro (M1). The Rosetta 2 tests ran natively on macOS Big Sur 11.1; the ExaGear tests ran in a virtual machine with Linux kernel 5.4.1.

Geekbench 5.4.1 (points, higher is better):

| Bench name | ARM64 macOS | Rosetta 2 (x86) | Rosetta 2 efficiency | ARM64 Linux (VM) | ExaGear (x86) | ExaGear efficiency |
|---|---|---|---|---|---|---|
| AES-XTS | 2769 | 1720 | 62.1% | 2703 | 1823 | 67.4% |
| Text Compression | 1502 | 1319 | 87.8% | 1438 | 1349 | 93.8% |
| Image Compression | 1356 | 1056 | 77.9% | 1335 | 1230 | 92.1% |
| Navigation | 1716 | 1678 | 97.8% | 1717 | 1605 | 93.5% |
| HTML5 | 1642 | 1066 | 64.9% | 1717 | 1302 | 75.8% |
| SQLite | 1400 | 1000 | 71.4% | 1328 | 1229 | 92.6% |
| PDF Rendering | 1600 | 1324 | 82.8% | 1738 | 1464 | 84.2% |
| Text Rendering | 1778 | 1290 | 72.6% | 1726 | 1480 | 85.8% |
| Clang | 1661 | 1244 | 74.9% | 1738 | 1335 | 76.8% |
| Camera | 1612 | 1278 | 79.3% | 1464 | 1148 | 78.4% |
| N-Body Physics | 1789 | 1503 | 84.0% | 1830 | 1616 | 88.3% |
| Rigid Body Physics | 1772 | 1352 | 76.3% | 1690 | 1176 | 69.6% |
| Gaussian Blur | 1407 | 1251 | 88.9% | 1435 | 1297 | 90.4% |
| Face Detection | 2215 | 1500 | 67.7% | 2216 | 1632 | 73.7% |
| Horizon Detection | 1964 | 1268 | 64.6% | 1879 | 1386 | 73.8% |
| Image Inpainting | 3214 | 2893 | 90.0% | 3345 | 2903 | 86.8% |
| HDR | 2486 | 2250 | 90.5% | 2761 | 2690 | 97.4% |
| Ray Tracing | 2553 | 1970 | 77.2% | 2055 | 1992 | 96.9% |
| Structure from Motion | 1406 | 1068 | 76.0% | 1507 | 1150 | 76.3% |
| Speech Recognition | 1601 | 1371 | 85.6% | 1485 | 1355 | 91.3% |
| Machine Learning | 1243 | 713 | 57.4% | 1229 | 727 | 59.2% |
| GeoMean | | | 76.9% | | | 82.4% |

As we can see, Huawei ExaGear achieves a higher average efficiency than Rosetta 2 (a geometric mean of 82.4% versus 76.9%) and loses in only 4 of the 21 tests: Navigation, Camera, Rigid Body Physics, and Image Inpainting.
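For readers who want to re-check the summary row, here is a small Python sketch that recomputes the geometric mean and the list of subtests where Rosetta 2 comes out ahead. It is an illustration only: the variable and function names are hypothetical, and the input is the already-rounded efficiency percentages from the Geekbench table, so the result can differ slightly from a mean computed on the raw scores.

```python
import math

# (benchmark, Rosetta 2 efficiency %, ExaGear efficiency %), copied from the table.
GEEKBENCH_EFFICIENCY = [
    ("AES-XTS", 62.1, 67.4), ("Text Compression", 87.8, 93.8),
    ("Image Compression", 77.9, 92.1), ("Navigation", 97.8, 93.5),
    ("HTML5", 64.9, 75.8), ("SQLite", 71.4, 92.6),
    ("PDF Rendering", 82.8, 84.2), ("Text Rendering", 72.6, 85.8),
    ("Clang", 74.9, 76.8), ("Camera", 79.3, 78.4),
    ("N-Body Physics", 84.0, 88.3), ("Rigid Body Physics", 76.3, 69.6),
    ("Gaussian Blur", 88.9, 90.4), ("Face Detection", 67.7, 73.7),
    ("Horizon Detection", 64.6, 73.8), ("Image Inpainting", 90.0, 86.8),
    ("HDR", 90.5, 97.4), ("Ray Tracing", 77.2, 96.9),
    ("Structure from Motion", 76.0, 76.3), ("Speech Recognition", 85.6, 91.3),
    ("Machine Learning", 57.4, 59.2),
]


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need a non-empty sequence of positive values")
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


rosetta_geomean = geometric_mean(r for _, r, _ in GEEKBENCH_EFFICIENCY)
exagear_geomean = geometric_mean(e for _, _, e in GEEKBENCH_EFFICIENCY)

# Subtests in which ExaGear's efficiency is lower than Rosetta 2's.
exagear_losses = [name for name, r, e in GEEKBENCH_EFFICIENCY if e < r]

print(f"Rosetta 2 geomean: {rosetta_geomean:.1f}%")
print(f"ExaGear geomean:   {exagear_geomean:.1f}%")
print(f"ExaGear loses in {len(exagear_losses)} of {len(GEEKBENCH_EFFICIENCY)}: {exagear_losses}")
```

The geometric mean is the usual choice for averaging ratios such as these, because it treats a 2x slowdown and a 2x speedup symmetrically.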

Here we need to make a small but quite interesting digression. Although the M1 processor was officially released only in November 2020, its internal design and instruction set must have been finalized long before, likely in 2019. Despite this, a closer examination reveals that it supports the Armv8.7 architecture extension, which was not published until autumn 2020. A significant portion of this extension aims to simplify and speed up operations that are common in binary translation for an x86 guest architecture. That is, Apple was developing a processor with an extension that was not yet official at the time. Moreover, a close look at earlier extensions reveals that Armv8.5 and Armv8.4 also included operations that support binary translation. This suggests that Apple has been working in close cooperation with Arm for quite some time, pursuing hardware support for its binary translation solution. Apple Rosetta 2 leverages all these features and therefore has a definite advantage over Huawei ExaGear, which does not exploit these extensions.

We believe this is why Apple Rosetta 2 performs better in these four benchmarks. But nevertheless, Huawei ExaGear shows better results in other tests, despite being at somewhat of a disadvantage for not using these advanced architecture features.

It is also worth noting that the performance figures in native Arm-mode for MacOS and Linux are generally quite close, which confirms the general correctness of our approach of comparing the performance of binary translators. 
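As a rough illustration of this check, the sketch below computes the relative gap between the native macOS and native Linux (VM) Geekbench scores for a handful of rows from the table. It is written in Python with hypothetical names, and the rows were picked by hand to show both near-identical results and the larger outliers, so it is not a full analysis.

```python
def relative_gap(macos_score, linux_score):
    """Signed gap of the Linux (VM) native score relative to the macOS native score."""
    if macos_score <= 0:
        raise ValueError("macOS score must be positive")
    return (linux_score - macos_score) / macos_score


# (native ARM64 macOS score, native ARM64 Linux VM score) from the Geekbench table.
NATIVE_SCORES = {
    "AES-XTS": (2769, 2703),
    "Navigation": (1716, 1717),
    "Face Detection": (2215, 2216),
    "HDR": (2486, 2761),
    "Ray Tracing": (2553, 2055),
}

for name, (macos, linux) in NATIVE_SCORES.items():
    print(f"{name:15s} {relative_gap(macos, linux):+.1%}")
```

Most rows differ by only a few percent, while a few subtests (Ray Tracing in this sample) show a noticeably larger gap, which is why efficiency is always computed against the native result from the same operating system.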

Next, let us compare performance on SpecCPU2006 and SpecCPU2017. The execution time of each subtest is measured in seconds. Benchmarks written in Fortran were excluded.

SpecCPU2006 (in seconds, lower is better)

Compiler: clang 11.0 -O3 -flto

| INT tests | ARM64 macOS | Rosetta 2 (x86) | Rosetta 2 efficiency | ARM64 Linux (VM) | ExaGear (x86) | ExaGear efficiency |
|---|---|---|---|---|---|---|
| 400.perlbench | 157 | 218 | 72.0% | 145 | 166 | 87.3% |
| 401.bzip2 | 248 | 326 | 76.1% | 247 | 282 | 87.6% |
| 429.mcf | 106 | 118 | 89.8% | 129 | 123 | 104.9% |
| 445.gobmk | 178 | 198 | 89.9% | 183 | 193 | 94.8% |
| 456.hmmer | 159 | 170 | 93.5% | 164 | 142 | 115.5% |
| 458.sjeng | 246 | 300 | 82.0% | 253 | 281 | 90.0% |
| 462.libquantum | 94 | 107 | 87.9% | 101 | 128 | 78.9% |
| 464.h264ref | 203 | 328 | 61.9% | 201 | 271 | 74.2% |
| 471.omnetpp | 146 | 179 | 81.6% | 173 | 193 | 89.6% |
| 473.astar | 184 | 203 | 90.6% | 196 | 201 | 97.5% |
| 483.xalancbmk | 82 | 104 | 78.8% | 96 | 112 | 85.7% |
| GeoMean INT | | | 81.7% | | | 90.8% |
| FP tests | | | | | | |
| 433.milc | 95 | 130 | 73.1% | 98 | 127 | 77.2% |
| 444.namd | 139 | 172 | 80.8% | 139 | 164 | 84.8% |
| 447.dealII | 108 | 117 | 92.3% | 115 | 151 | 76.2% |
| 450.soplex | 91 | 104 | 87.5% | 95 | 107 | 88.8% |
| 453.povray | 60 | 78 | 76.9% | 52 | 65 | 80.0% |
| 470.lbm | 113 | 119 | 95.0% | 114 | 99 | 115.2% |
| 482.sphinx3 | 194 | 207 | 93.7% | 195 | 215 | 90.7% |
| GeoMean FP | | | 85.2% | | | 86.7% |

SpecCPU2017 (in seconds, lower is better)

Compiler: clang 11.0 -O3 -flto

| INT tests | ARM64 macOS | Rosetta 2 (x86) | Rosetta 2 efficiency | ARM64 Linux (VM) | ExaGear (x86) | ExaGear efficiency |
|---|---|---|---|---|---|---|
| 500.perlbench_r | 216 | 282 | 76.6% | 210 | 241 | 87.1% |
| 502.gcc_r | 125 | 164 | 76.2% | 125 | 153 | 81.7% |
| 505.mcf_r | 202 | 235 | 86.0% | 217 | 231 | 93.9% |
| 520.omnetpp_r | 277 | 360 | 76.9% | 295 | 324 | 91.0% |
| 523.xalancbmk_r | 166 | 204 | 81.4% | 189 | 197 | 95.9% |
| 525.x264_r | 154 | 176 | 87.5% | 161 | 189 | 85.2% |
| 531.deepsjeng_r | 175 | 221 | 79.2% | 190 | 197 | 96.4% |
| 541.leela_r | 282 | 294 | 95.9% | 285 | 277 | 102.9% |
| 557.xz_r | 275 | 319 | 86.2% | 307 | 335 | 91.6% |
| GeoMean INT | | | 82.7% | | | 91.6% |
| FP tests | | | | | | |
| 508.namd_r | 111 | 134 | 82.8% | 111 | 133 | 83.5% |
| 510.parest_r | 308 | 331 | 93.1% | 304 | 336 | 90.5% |
| 511.povray_r | 211 | 320 | 65.9% | 196 | 253 | 77.5% |
| 519.lbm_r | 121 | 153 | 79.1% | 127 | 142 | 89.4% |
| 526.blender_r | 149 | 179 | 83.2% | 163 | 175 | 93.1% |
| 538.imagick_r | 210 | 345 | 60.9% | 227 | 271 | 83.8% |
| 544.nab_r | 156 | 186 | 83.9% | 150 | 174 | 86.2% |
| GeoMean FP | | | 77.7% | | | 86.1% |

As we can see, the SpecCPU2006 and SpecCPU2017 results follow the same pattern as Geekbench: on average, Huawei ExaGear beats Apple Rosetta 2, although in a few subtests Rosetta 2 outperforms ExaGear.
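A detail worth noticing in the Spec tables is that some ExaGear efficiencies exceed 100%, which under this metric simply means that the translated x86 binary finished sooner than the native Arm binary in that run. The Python sketch below recomputes a few such rows from the published execution times; the names are hypothetical and the selection of rows is only illustrative.

```python
# (native ARM64 Linux VM time, ExaGear x86 time) in seconds, from the Spec tables.
EXAGEAR_TIMES = {
    "400.perlbench": (145, 166),
    "429.mcf": (129, 123),
    "456.hmmer": (164, 142),
    "470.lbm": (114, 99),
    "541.leela_r": (285, 277),
}


def time_efficiency_percent(native_seconds, translated_seconds):
    """Efficiency for lower-is-better timings: native time / translated time, in percent."""
    return round(native_seconds / translated_seconds * 100, 1)


efficiencies = {
    name: time_efficiency_percent(native, translated)
    for name, (native, translated) in EXAGEAR_TIMES.items()
}

# Rows where the translated binary ran faster than the native one.
faster_than_native = [name for name, eff in efficiencies.items() if eff > 100.0]

for name, eff in efficiencies.items():
    print(f"{name:15s} {eff:6.1f}%")
print("Faster than native:", faster_than_native)
```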

Huawei ExaGear vs Microsoft binary translator

Next, we will compare the performance of Huawei ExaGear against the binary translator from Microsoft using a Huawei MateBook E.

ExaGear runs in the WSL environment, and Microsoft’s translator runs natively under Windows 10.

When ExaGear runs within WSL, the timers it requires for profiling are not available. This prevented ExaGear from applying its full code optimization. We estimate that this reduces performance by around 10-20%, depending on the benchmark.

Geekbench 5.4.1 (points, higher is better):

| Bench name | ARM64 Windows | MS BT (x86) | MS BT efficiency | ARM64 WSL | ExaGear (x86) | ExaGear efficiency |
|---|---|---|---|---|---|---|
| AES-XTS | 872 | 437 | 50.1% | 892 | 437 | 49.0% |
| Text Compression | 514 | 381 | 74.1% | 518 | 451 | 87.1% |
| Image Compression | 577 | 328 | 56.8% | 606 | 433 | 71.5% |
| Navigation | 402 | 393 | 97.8% | 522 | 502 | 96.2% |
| HTML5 | 517 | 224 | 43.3% | 522 | 371 | 71.1% |
| SQLite | 534 | 240 | 44.9% | 565 | 412 | 72.9% |
| PDF Rendering | 515 | 253 | 49.1% | 574 | 462 | 80.5% |
| Text Rendering | 530 | 264 | 49.8% | 544 | 430 | 79.0% |
| Clang | 485 | 187 | 38.6% | 601 | 405 | 67.4% |
| Camera | 437 | 221 | 50.6% | 479 | 308 | 64.3% |
| N-Body Physics | 390 | 253 | 64.9% | 390 | 317 | 81.3% |
| Rigid Body Physics | 634 | 299 | 47.2% | 717 | 485 | 67.6% |
| Gaussian Blur | 279 | 217 | 77.8% | 282 | 238 | 84.4% |
| Face Detection | 598 | 301 | 50.3% | 657 | 387 | 58.9% |
| Horizon Detection | 484 | 235 | 48.6% | 542 | 331 | 61.1% |
| Image Inpainting | 580 | 419 | 72.2% | 775 | 504 | 65.0% |
| HDR | 637 | 481 | 75.5% | 1035 | 843 | 81.4% |
| Ray Tracing | 758 | 296 | 39.1% | 762 | 513 | 67.3% |
| Structure from Motion | 379 | 203 | 53.6% | 457 | 289 | 63.2% |
| Speech Recognition | 346 | 283 | 81.8% | 355 | 312 | 87.9% |
| Machine Learning | 252 | 108 | 42.9% | 240 | 128 | 53.3% |
| GeoMean | | | 55.6% | | | 70.9% |

SpecCPU2017 (in seconds, lower is better)

Compiler: clang 11.0 -O3 -flto

| INT tests | ARM64 Windows | MS BT (x86) | MS BT efficiency | ARM64 WSL | ExaGear (x86) | ExaGear efficiency |
|---|---|---|---|---|---|---|
| 500.perlbench_r | 824 | 1526 | 54.0% | 800 | 1042 | 76.8% |
| 502.gcc_r | 716 | 1119 | 64.0% | 690 | 820 | 84.1% |
| 505.mcf_r | 781 | 1069 | 73.1% | 792 | 992 | 79.8% |
| 520.omnetpp_r | 1280 | 2025 | 63.2% | 1186 | 1414 | 83.9% |
| 525.x264_r | 460 | 771 | 59.7% | 481 | 638 | 75.4% |
| 531.deepsjeng_r | 507 | 711 | 71.3% | 524 | 629 | 83.3% |
| 541.leela_r | 564 | 897 | 62.9% | 547 | 652 | 83.9% |
| 557.xz_r | 733 | 938 | 78.1% | 733 | 827 | 88.6% |
| GeoMean INT | | | 65.4% | | | 81.9% |
| FP tests | | | | | | |
| 508.namd_r | 423 | 698 | 60.6% | 429 | 619 | 69.3% |
| 510.parest_r | 1006 | 1274 | 79.0% | 970 | 1179 | 82.3% |
| 511.povray_r | 808 | 1540 | 52.5% | 761 | 1027 | 74.1% |
| 519.lbm_r | 428 | 767 | 55.8% | 429 | 549 | 78.1% |
| 526.blender_r | 474 | 688 | 68.9% | 484 | 716 | 67.6% |
| 538.imagick_r | 560 | 1075 | 52.1% | 566 | 847 | 66.8% |
| 544.nab_r | 522 | 1050 | 49.7% | 517 | 617 | 83.8% |
| GeoMean FP | | | 59.0% | | | 74.3% |

Unfortunately, we were unable to build SpecCPU2006 for MS Windows, so only SpecCPU2017 results were collected for this comparison. As the comparison with Rosetta 2 showed, SpecCPU2006 and SpecCPU2017 give broadly similar results, so we do not expect the missing suite to change the overall picture.

Conclusion

Apple’s engineers have not only produced an outstanding, game-changing processor but have also equipped it with a high-performance x86 binary translator. Nevertheless, our tests show that Huawei ExaGear confidently outperforms Apple’s Rosetta 2.

Microsoft’s solution proved inferior to ExaGear and, although we cannot compare them directly, to Rosetta 2 as well. This is expected given Microsoft’s lack of expertise and experience in developing binary translators.
