At CES 2015, NVIDIA opened with a heavyweight announcement: the Tegra X1. At the launch, NVIDIA claimed that the new X1 delivers twice the performance of the previous-generation K1, which would make the Tegra X1 the most powerful mobile processor on the market today. Let's take a look at what makes the X1 stronger, and where that strength comes from.
Let's start with NVIDIA's home turf, the GPU. Back at GTC 2014, NVIDIA announced that the next-generation Tegra processor would use a Maxwell-architecture GPU. Maxwell had already appeared in desktop GPUs, and NVIDIA put a great deal of work into bringing the architecture to a mobile processor. Unlike the Kepler GPU in the Tegra K1, the Maxwell GPU in the X1 is a new design started from scratch, not a casual port.
When NVIDIA decided to put its mobile processor business first, the company's ambitions became plain to see. For Tegra, high-priority treatment means that NVIDIA's latest and most powerful GPU reaches mobile processors much faster: the releases of Maxwell 1 and the Tegra X1 were only about a year apart, far shorter than the two-year gap between Kepler and the K1.
In addition, high priority means that NVIDIA designs the architecture for mobile-level power consumption from the ground up. This is good not only for Tegra; it also does a lot for the energy efficiency of desktop-class GPUs.
The Tegra X1 is therefore NVIDIA's first product under this policy, and it matters a great deal to the company. Helped by this product strategy, the Tegra X1 is a major evolution of the Tegra K1, and much of that evolution comes from adopting the Maxwell architecture. On the CPU side, NVIDIA set out to offer the most powerful CPU on the market and turned to ARM's A57 (although the A57 will be the standard high-end CPU architecture for some time, so the Tegra X1's biggest weapon is still its ferocious GPU).
Looking deeper into the Tegra X1's GPU, what we see is a Maxwell 2 GPU designed for Tegra. Compared with the earlier Kepler, the Maxwell 2 architecture adds a series of new features, including third-generation delta color compression, and the energy efficiency of each CUDA core has also improved. Other graphics features include conservative rasterization, volume tiled resources, and multi-frame anti-aliasing. In short, just about every cool-sounding feature has been bundled into the Tegra X1.
Of all the improvements in the X1, the gains in memory bandwidth and overall efficiency are the most important, because these two points are essentially where mobile processors hit their bottleneck. To improve memory bandwidth, mobile processor makers usually widen the memory bus on their high-end parts (to 96-bit or 128-bit). This brute-force approach is certainly the most effective and most direct, but a wider bus raises cost and makes both the processor and its peripherals more complex. In the X1, NVIDIA still uses a 64-bit memory bus, so to keep the powerful GPU from being starved of data, NVIDIA added data compression and moved to LPDDR4 memory, which lets the X1's GPU perform to its full potential.
The thermal design power (TDP) of a mobile processor is another limiting factor, so efficiency gains pay off significantly here as well: lower power consumption leaves room for higher performance, and keeping heat under control lets the processor perform better under sustained load. This is why the X1 pairs Maxwell with TSMC's 20 nm process to optimize power consumption.
Last but most important, the X1 also has a feature specific to the mobile GPU that does not appear in the desktop GPUs. NVIDIA calls it “Double Speed FP16”. With this feature, the CUDA cores reach a higher level of FP16 performance, which is useful in certain scenarios.
Like Kepler and Fermi before it, Maxwell has only dedicated FP32 and FP64 CUDA cores, and the X1 is no exception. Recognizing the importance of FP16, however, the X1 handles FP16 work in its own way. On the K1, an FP16 operation was simply promoted to FP32 and processed by an FP32 core. The X1 instead packs two FP16 operations together into a single Vec2 and hands that to one FP32 CUDA core.
In a nutshell, the X1 can process both packed FP16 values with the same operation at once, and with packing it uses its CUDA cores more fully and more flexibly.
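The packing idea can be illustrated in software. The sketch below is not NVIDIA's hardware implementation; it only shows how two FP16 values fit into one 32-bit word and how a single "operation" can then be applied to both lanes. The function names `pack_half2`, `unpack_half2`, and `fma_half2` are made up for this illustration, and the arithmetic is done in Python floats and rounded back to FP16 afterwards.

```python
import struct
from typing import Tuple


def pack_half2(a: float, b: float) -> int:
    """Pack two FP16 values into one 32-bit word (a in the low half, b in the high half)."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_half2(word: int) -> Tuple[float, float]:
    """Split a 32-bit word back into its two FP16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))


def fma_half2(x: int, y: int, z: int) -> int:
    """Apply the same multiply-add (x * y + z) to both FP16 lanes of three packed words.

    Hypothetical helper: the math runs in double precision and is rounded
    to FP16 only when the result is packed again.
    """
    x0, x1 = unpack_half2(x)
    y0, y1 = unpack_half2(y)
    z0, z1 = unpack_half2(z)
    return pack_half2(x0 * y0 + z0, x1 * y1 + z1)


if __name__ == "__main__":
    word = pack_half2(1.5, -2.0)
    print(hex(word))  # 0xc0003e00
    result = fma_half2(word, pack_half2(2.0, 2.0), pack_half2(0.5, 0.5))
    print(unpack_half2(result))  # (3.5, -3.5)
```

One call to `fma_half2` produces two FP16 results from one 32-bit-wide operand set, which is the intuition behind the doubled FP16 rate.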
This is actually not a novel idea; NVIDIA's competitors have been doing it for some time, and on the whole the approach still feels somewhat opportunistic. ARM and Imagination both have FP16 capability in their current GPUs (either dedicated FP16 units or a better ALU arrangement), and even AMD has joined in, so NVIDIA's move is a reasonable one.
But why does FP16 matter? It is a long story, but simply put, FP16 is widely used in Android's display compositing, because this kind of low-precision calculation is crucial to saving power on Android. FP16 operations also have a place in mobile games, and FP16 is used in image recognition applications as well (such as NVIDIA's own Drive PX platform).
FP16 does have its limitations, since 16-bit floating-point numbers are not precise enough for many of today's workloads. Still, FP16 plays an important role in the applications mentioned above, so handling FP16 quickly and accurately matters too.
So much for features; for the rest, it is time to let the numbers speak.
Overall, the X1's GPU consists of two Maxwell SMMs inside a single GPC, for a total of 256 CUDA cores. Compared with the single SMX in the K1, this means that basic per-SMM/SMX resources such as the geometry and texture units are doubled, and the X1's more efficient CUDA cores leave Kepler's in the dust.
Beyond the CUDA core count, NVIDIA has also reworked the raster output (ROP) units. The X1 now has 16 ROPs, four times as many as the K1, which also matches the ROP count of GM107. This upgrade is vital for the X1's 4K@60Hz support, and the improved bandwidth management (in both efficiency and actual bandwidth) ensures that these ROPs are not starved when handling heavy workloads.
Finally, we inevitably come back to clock frequency and expected performance. NVIDIA has not yet officially announced the X1's GPU frequency, but the performance figures it has published offer a clue: NVIDIA claims that the X1's FP16 throughput reaches 1 TFLOPS, which implies that the GPU's maximum frequency may be about 1 GHz (1 GHz × 2 FP16 × 2 FMA × 256 = 1 TFLOPS).
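The back-of-the-envelope estimate can be written out as a small calculation. This is only a sketch of the arithmetic in the text, assuming a 1 GHz clock (the article's guess, not an announced figure); the function name `peak_flops` and its parameters are made up for illustration.

```python
def peak_flops(clock_hz: int, cuda_cores: int,
               values_per_lane: int = 2, flops_per_fma: int = 2) -> int:
    """Peak throughput = clock x packed values per core x FLOPs per FMA x cores."""
    return clock_hz * values_per_lane * flops_per_fma * cuda_cores


if __name__ == "__main__":
    assumed_clock_hz = 1_000_000_000  # assumption: 1 GHz, inferred from the 1 TFLOPS claim
    fp16 = peak_flops(assumed_clock_hz, cuda_cores=256)
    print(fp16)  # 1024000000000, i.e. roughly 1 TFLOPS
```

Setting `values_per_lane=1` in the same formula gives the corresponding figure without FP16 packing, which is half the packed rate.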
That frequency is essentially at desktop level, which is very aggressive for a mobile processor. What form the X1 will take when it finally reaches consumers is still unknown. For now, the only certainty is that devices carrying the Tegra X1 will not be arriving any time soon (though that may not apply to NVIDIA's own products). With such a chip running at full speed, power consumption and heat dissipation are also unavoidable problems.
Update: field performance tests by Hardwarezone
3DMark score: 43,241 points, twice that of Apple's A8X.
GFXBench run: frame rates are off the charts.
Average power consumption chart: against the Apple A8X's average of 2.651 W, the X1 averages 1.498 W. If power consumption can be kept under control like this, a Tegra X1 phone is not out of the question.