| Spec | RTX 3070 | RTX 2080 Ti |
| --- | --- | --- |
| GPU | GA104-300 | TU102-300-K1-A1 |
| Interface | PCI Express 4.0 | PCI Express 3.0 |
| CUDA cores | 5,888 | 4,352 |
| Tensor cores | 184 | 544 |
| RT cores | 46 | 68 |
| Base clock | 1,500 MHz | 1,350 MHz |
| Boost clock | 1,725 MHz | 1,545 MHz |
| Memory | 8 GB GDDR6 | 11 GB GDDR6 |
| Memory speed | 14 Gbps | 14 Gbps |
| Memory interface | 256-bit | 352-bit |
| Bandwidth | 448 GB/s | 616 GB/s |
| TDP | 220 W | 260 W |
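The bandwidth row follows from the memory interface width and the memory speed: peak bandwidth is bus width times per-pin data rate, divided by 8 bits per byte. A quick sanity check of the table, as a minimal sketch (the function and variable names here are my own, not from any spec sheet):

```python
def memory_bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), taken from the table above
cards = {
    "RTX 3070": (256, 14.0),
    "RTX 2080 Ti": (352, 14.0),
}

bandwidth = {name: memory_bandwidth_gb_per_s(*spec) for name, spec in cards.items()}

for name, gb_per_s in bandwidth.items():
    print(f"{name}: {gb_per_s:.0f} GB/s")
```

Both cards use 14 Gbps GDDR6, so the 2080 Ti's bandwidth advantage comes entirely from its wider 352-bit bus.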
The way CUDA cores are counted no longer seems to be the same between Turing and Ampere. As for the question from the colleague above, we'll just have to wait for the reviews to really know.
Here's an analysis I found on videocardz that seems to make sense.
"Clear some thoughts on Ampere's CUDA cores:
Ampere's SM is the same as Turing's, with two major changes (excluding cache upgrades): improved triangle intersection performance on the RT fixed-function units, and the integer math pipelines are now capable of floating-point math, too. The actual number of pipelines in each SM remains the same, at 128. By Nvidia's own numbers, for every 100 instructions, around 60 are float and 40 are integer (probably the BEST case for integer). This already speaks volumes about the "doubled CUDA cores" on Ampere. In reality, the shader performance of an Ampere SM is only (peak) 2x higher than Turing's in FP32, which is around 60-70% of the instructions used. The integer throughput remains the same (per SM).
RTX 3080 has 8704 "advertised CUDA cores". In reality, it has 4352 pipelines capable of floating-point math, and 4352 pipelines capable of either integer OR floating-point math. In pure float mode, this is 128 float ops/clock per SM (not including FMA), but the SM may also operate in "concurrent mode", in which case the execution is IDENTICAL to Turing, with 64 int and 64 float, concurrently.
RTX 2080 Ti, for comparison, has 4352 "advertised CUDA cores". In reality, it has 4352 floating-point pipes and 4352 int pipes, and runs int + float for concurrent work.
RTX 3070 has 5888 "advertised CUDA cores"; in reality it has 2944 float pipes and 2944 int OR float pipes. While this means 5888 float ops/clock across the entire GPU (not including FMA), it actually has significantly less integer shading throughput than the RTX 2080 Ti, with only 2944 integer pipes vs 4352. In concurrent mode, the 3070 is identical to the RTX 2080, with 2944 int + 2944 float.
RTX 3070 will not be significantly faster than the RTX 2080 Ti, and it will not be universally faster. It will beat the 2080 Ti in raw throughput in games leaning towards float shaders, but it has deficits in raw memory bandwidth, pixel throughput and (potentially, though unlikely; TU104 had 6 raster engines, so GA104 likely does too) geometry throughput.
An overclocked RTX 2080 Ti will blast past the RTX 3070 (even overclocked), and this will essentially be a similar situation to GTX 1070 vs GTX 980 Ti: the 1070 at stock was marketed as faster than the 980 Ti, which was true (albeit slightly), but with OC the 980 Ti was much faster, even approaching 1080 speeds. Even though the RTX 2080 Ti doesn't have that huge OC headroom, it will gain more from overclocking on average. All comparisons vs the RTX 2080 Ti made by Nvidia will be against the stock, reference Turing FE model, which is known to run at around 1800 MHz tops."
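To make the pipe arithmetic in the quote easier to follow, here is a toy model in Python. It is only a sketch under loud assumptions: the lane split is exactly what the quoted comment describes (Turing: one FP32 lane plus one separate INT32 lane per advertised core; Ampere: half the advertised cores are FP32-only, the other half do FP32 or INT32), scheduling is perfect, clocks are equal, and memory, cache and FMA are ignored. All names (`Gpu`, `cycles`, `speedup`) are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    fp_only_lanes: int     # lanes that can only do FP32
    int_lanes: int         # lanes that can do INT32
    int_lanes_do_fp: bool  # True if the INT32 lanes can also do FP32 (Ampere, per the quote)


def turing(name: str, advertised_cores: int) -> Gpu:
    # Assumption from the quote: one FP32 lane and one separate INT32 lane per advertised core.
    return Gpu(name, advertised_cores, advertised_cores, False)


def ampere(name: str, advertised_cores: int) -> Gpu:
    # Assumption from the quote: half FP32-only lanes, half INT32-or-FP32 lanes.
    half = advertised_cores // 2
    return Gpu(name, half, half, True)


def cycles(gpu: Gpu, fp_ops: float, int_ops: float) -> float:
    """Idealized clock cycles to retire the given instruction mix."""
    if gpu.int_lanes_do_fp:
        total_lanes = gpu.fp_only_lanes + gpu.int_lanes
        # INT work is confined to the shared lanes; FP work can spill onto any idle lane.
        return max(int_ops / gpu.int_lanes, (fp_ops + int_ops) / total_lanes)
    # Separate pipes: FP and INT run concurrently, the slower side sets the pace.
    return max(fp_ops / gpu.fp_only_lanes, int_ops / gpu.int_lanes)


def speedup(a: Gpu, b: Gpu, fp_ops: float, int_ops: float) -> float:
    """How many times faster a is than b per clock under this model."""
    return cycles(b, fp_ops, int_ops) / cycles(a, fp_ops, int_ops)


rtx_3070 = ampere("RTX 3070", 5888)
rtx_2080_ti = turing("RTX 2080 Ti", 4352)

pure_float = speedup(rtx_3070, rtx_2080_ti, fp_ops=100, int_ops=0)
mixed = speedup(rtx_3070, rtx_2080_ti, fp_ops=60, int_ops=40)  # the 60/40 mix from the quote

print(f"pure float, per clock: {pure_float:.2f}x")
print(f"60/40 mix, per clock:  {mixed:.2f}x")
```

Under these assumptions the 3070 comes out about 1.35x ahead per clock on pure float work and about 0.81x on the 60/40 mix, before accounting for its higher boost clock. That lines up with the quote's point that the doubled core count only pays off fully in float-heavy shaders. It is not a performance prediction; real results depend on clocks, bandwidth and everything else this model leaves out.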
A mid-range card with 8GB memory.
NVIDIA GeForce RTX 3060 Ti
The first mid-range Ampere card to launch this year is GeForce RTX 3060 Ti, likely to be the first SKU in these series. We now have two sources confirming that the next graphics card in the Ampere GeForce RTX 30 series is RTX 3060 […]
videocardz.com