On Tuesday, 10 June 2025, NVIDIA Corporation (NASDAQ:NVDA) participated in Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025. The conference highlighted NVIDIA’s strategic focus on networking as a key driver in the evolution of AI data centers. While the company emphasized its innovative solutions, challenges in transitioning from copper to optical networking were also discussed.
Key Takeaways
- NVIDIA’s networking strategy is pivotal for AI data centers, with a focus on scale-up and scale-out networking.
- InfiniBand remains crucial for AI supercomputers, while Spectrum X caters to broader AI markets.
- Transitioning from copper to optical solutions is essential for future data center connectivity.
- NVLink Fusion is introduced for custom accelerators leveraging NVIDIA’s infrastructure.
Operational Updates
- Scale-up Networking: Expanded to support up to 72 GPUs this year.
- AI Frameworks: Continuous evolution necessitates new compute engines and infrastructure.
- Networking Solutions: Customers can choose between Ethernet and InfiniBand based on their needs.
- Scale-out Networking: Spectrum X supports hundreds of thousands of GPUs for single workloads.
- NVLink Fusion: New offering for custom accelerators to utilize NVIDIA’s infrastructure.
- Optical Networking: Essential for scale-out infrastructure, requiring six transceivers per GPU.
- Co-packaged Silicon Photonics: Enhances GPU density by three times at the same (ISO) network power.
Future Outlook
- Workloads: Evolving workloads continue to drive the need for advanced computing solutions.
- GPUs: Discussions are ongoing about scaling to a million GPUs.
- Data Center Growth: Rapid increase in data center size anticipated.
Q&A Highlights
- InfiniBand vs. Ethernet: Choice depends on customer familiarity, software ecosystem, and workload needs.
- NVLink and Silicon Photonics: Copper used within racks; silicon photonics explored for scale-out networking.
- Silicon Photonics Adoption: Co-packaged technology aims to triple GPU numbers with existing power levels.
- GPU Count: NVIDIA has announced NVLink scale-up of 576 GPUs over copper.
In conclusion, NVIDIA’s focus on networking as a foundational element in AI data centers was evident throughout the summit. For a detailed understanding, refer to the full transcript below.
Full transcript - Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025:
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Good morning, everyone, and welcome to Rosenblatt Securities Fifth Annual Age of AI Scaling Tech Conference. My name is Kevin Cassidy. I’m one of the semiconductor analysts at Rosenblatt. And it’s my pleasure to introduce, Gilad Scheiner. He’s NVIDIA’s Senior VP of Networking.
Also with us is Stuart Steckert, NVIDIA's Senior Director of Investor Relations. On NVIDIA, we have a buy rating and a $200 twelve-month price target. And we're bullish on NVIDIA not only because of its leadership in AI, but also because of its ability to expand into full rack-scale deployments, including scale-up and scale-out networks. So we're fortunate to have Gilad speaking with us today.
Gilad is a networking expert. He joined Mellanox in 2001 as a design engineer and has served in senior marketing and management roles since 2005. Of course, NVIDIA acquired Mellanox in 2020. Gilad also serves as chairman of the HPC-AI Advisory Council, is president of the UCF and CCIX consortiums, is a member of the IBTA, and is a contributor to the PCI-SIG's PCI-X and PCI Express specifications. Gilad also holds multiple patents in the field of high-speed networking. So with that, I'll turn it over to Stuart to go over some of NVIDIA's disclosures.
Stuart Steckert, Senior Director of Investor Relations, NVIDIA: Thanks, Kevin. Thanks, everyone, for having us. As a reminder, the content of this call may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. So back over to you, Kevin.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Thanks, Stuart. I'll kick off the fireside chat with a few questions, and we'll take questions from the audience as well. To ask a question, click on the quote bubble in the graphic in the top right-hand corner of your screen, and I'll read the question to Gilad and Stuart. Keep in mind that this fireside chat is aimed at understanding NVIDIA's networking strategy.
Gilad will not be taking questions around financial guidance. So with that, thank you, Gilad, and great to see you again.
Gilad Scheiner, Senior VP of Networking, NVIDIA: Thank you very much, Kevin.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: So maybe we’ll start with a very high level question of what is the strategic importance of networking in AI data centers?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Well, it's a good start. So the data center is the unit of computing today. Previously, it was an element, a CPU or a GPU. But today, it's not the GPU.
It's not the server. It's the data center. The data center is the unit of computing that we use. Now, networking defines a data center.
The way that you connect those computing elements together will define what that data center can do. It could range from just building a server farm all the way to building an AI supercomputer that can run a single workload at large scale and do amazing stuff. So I'm not really referring to it as networking anymore; it's more that this is the computing infrastructure.
It's much more than a switch. It's much more than a NIC. It's a computing infrastructure.
And that's why it has become so critical, so important. That infrastructure will determine what kind of workloads you can do, what the efficiency of the data center will be, what your return on investment will be, how many users and workloads you can bring in, how many tokens you can support, how many end users you can host on the data center. And this is where the networking, or infrastructure, is so critical. Now, when you go and design networking for AI data centers, it's a completely different task than designing networking or infrastructure for the traditional hyperscale clouds. Here, we're not talking about single-server workloads.
We're talking about distributed computing. We're talking about workloads that need to run over multiple compute engines, which could be hundreds and thousands and tens of thousands and hundreds of thousands. So you need to make sure that every GPU gets the right throughput. Every GPU needs to be fully synchronized. The data that goes over the network needs to hit every GPU at the same time.
If you create skews on the network, what we call tail latency, then one GPU is going to finish later than the others. And we all know that when you're running AI infrastructure, the last element to complete the task determines the performance of the entire data center. So it's the tail latencies, it's the throughput, it's the latency across the fabric. It's making sure there is congestion control.
There is a huge number of elements in that infrastructure. That infrastructure will determine what you can do with the data center that you built. That's why it's so important.
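To make the tail-latency point concrete, here is a minimal, purely illustrative sketch; all numbers are invented and are not tied to any NVIDIA system. It shows how a single delayed GPU gates a whole synchronized step:

```python
# Purely illustrative: in a synchronized distributed step, the slowest GPU
# gates everyone. All numbers below are invented for demonstration.
per_gpu_compute_ms = [10.0] * 1024        # each GPU's ideal compute time (ms)
per_gpu_skew_ms = [0.0] * 1024            # extra delay from congestion / skew

per_gpu_skew_ms[7] = 4.0                  # a single straggler hit by congestion

step_time_ms = max(c + s for c, s in zip(per_gpu_compute_ms, per_gpu_skew_ms))
ideal_ms = max(per_gpu_compute_ms)

print(f"ideal step time:  {ideal_ms:.1f} ms")
print(f"actual step time: {step_time_ms:.1f} ms")
print(f"slowdown from one late GPU: {(step_time_ms / ideal_ms - 1) * 100:.0f}%")
```

One straggler out of 1,024 GPUs slows every step by the same 40%, which is why the discussion keeps returning to skew and congestion control.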
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Great. When I talk to investors, they've been hearing terms that they're a little confused about. You talk about connecting the entire data center, and even data center to data center, but the terms scale-up and scale-out networking are new to some investors. So maybe you could explain what the difference is and why each is important?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. And I'll try to make it maybe a little bit simple, because I see there are terms out there and people try to define what is scale-up and what is scale-out. We can start with examples. When we design an AI supercomputer, our scale-up infrastructure is NVLink, and our scale-out infrastructure could be InfiniBand or Spectrum X.
Those are the examples. Now, what's the difference between them? Scale-up is your ability to build a larger compute engine.
So in a scale-up infrastructure, or connectivity, we're taking those GPU ASICs, let's call them that, or GPU packages, and we want those GPU packages to behave like one. In order to build that one, you need scale-up infrastructure. That's what the scale-up network does. It takes those components and makes sure everything is balanced between them, the right message rate, the right connectivity, the right elements, in order to make those engines behave like one.
And this is why, if you see Jensen's keynote, he says that his GPU is the rack. This GPU is not the ASIC. You know, we have NVL72. That rack is the GPU, and the scale-up network enables that.
Okay, so scale-up enables you to build a larger GPU out of the different ASIC components. Now, once you define that larger GPU, you need to connect those GPUs together. And how many GPUs you connect together depends on what kind of workloads you're going to run.
What is the mission that you want to achieve? Connecting those GPUs together in order to form multiple GPUs that will work together and run those larger missions, that's where the scale-out network is needed. So there are different requirements for the scale-up infrastructure versus the scale-out infrastructure. One creates a larger compute engine, and the other connects multiple compute engines in order to support the different missions that you want to run on the data center.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: So just as an example, it was only last year that NVIDIA's scale-up network connected eight GPUs, and this year, it's 72 GPUs.
Gilad Scheiner, Senior VP of Networking, NVIDIA: Correct. And we talked about 576, right, in the keynote. Jensen talked about that as we're moving forward. And it's all determined according to what workloads you need to support. And obviously, as workloads continue to evolve and new workloads continue to emerge and you need to solve new things, everything in the data center is being added to or changed, right, or progressed.
So one example is that the unit of computing that was maybe a single GPU then became eight GPUs on NVLink. Now it's 72 GPUs, and it's going to go to 576. It's all in order to support the kinds of workloads you need to run or provide today.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: And maybe you touched on it, just the workloads, what is happening with the AI workloads and applications that are influencing some of the network requirements?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. So what you actually do is you build a data center, right? And that data center is aimed to serve the workloads that you define, the workloads that you want to run on the data center. So essentially, everything needs to be connected together.
And there are different elements that you can look at. One is building, or co-designing, the network with the compute, for example. And this is important because you design a data center, you don't design components. And I'll give you two examples of what co-design means.
One example is that in the traditional world, let's say, there were compute engines that did the compute, and then there were networking elements that were tasked with moving data. That was the separation between them. But when you design a data center to serve AI workloads, you have the ability to decide where to put what, and now there are no boundaries.
For example, we took compute algorithms that traditionally run on compute components, and we're running them on the network. What we call SHARP in InfiniBand is doing data reduction on the data in the network. So the network is not just moving data. It's actually participating in the compute cycles.
And why are we doing that? Because once you do the reduction operations on the network, you can save half of the bandwidth that you need, and you can complete things much faster. So this is an example where you move things from compute to the network.
On the other side, traditional network topology was built around the concept of a top-of-rack switch, which means all of the NICs go to the switch on top of the rack, and then you connect those switches together. This is the wrong thing to do if you build an AI data center. Because, as you mentioned, the earlier generation had eight GPUs connected on NVLink. Those eight GPUs already communicate between themselves on NVLink. They don't really need to talk with each other again over Ethernet or InfiniBand.
So why would you put all of those GPUs on the top-of-rack switch? It doesn't make sense. You want to spread that connectivity and have every GPU connect to other GPUs in a fabric, and this is where we created a multi-rail topology. So now the network is designed the way the compute is running, the way the compute algorithms are running.
And then we're taking some of the compute algorithms and running them on the network, because it's much more efficient to do it there. So this is one element: AI workloads require you to actually design a full data center, design that unit of computing, and you want to do that in full synergy, in a full co-design. That's one thing.
The other thing, of course, is that AI frameworks continue to evolve. That's why every year there is a new compute engine coming out. There is new network infrastructure, or computing infrastructure, coming out. There is a new GPU.
There are new NICs. There are new switches in order to serve scale. Because we see increases in scale. You're moving from thousands of GPUs to tens of thousands.
A year later, you go to hundreds of thousands of GPUs. People are now talking about million-GPU scale. You need to be able to grow that element. You have so many routes. Just think about it.
With all those GPUs, every GPU communicating on the network, there are so many routes that you need to make sure traffic goes in the right direction so nothing collides with anything else, so there is no congestion on the network. There is so much complexity in that network, and that's why every year there is a new generation, new capabilities, new elements brought into the compute infrastructure to support the full data center design and to support the different kinds of workloads that we see.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay. You're connecting these hundreds of thousands of GPUs now, but maybe I'll roll it back a little to the Mellanox acquisition. Mellanox had both Ethernet and InfiniBand. I guess, what's the difference in the two offerings? Are both scale-out?
And if we go into depth talking about the scale-out networks, how do you decide whether InfiniBand is the right solution or Ethernet is the right solution?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. Actually, we let our customers decide what makes sense for them. And maybe I'll start a little bit with the history. Mellanox did start with InfiniBand.
And InfiniBand was built for distributed computing workloads. It was built in a sense that it is lossless, unlike traditional networks, where lossy was fine. Meaning, a traditional network works in a way that if there is a collision on the network, you don't try to solve that collision.
You just drop packets, because it's okay, I can retransmit the data. But when you deal with distributed computing applications, if you drop data, you're going to retransmit it, and the fact that you retransmit the data to a single GPU, and that GPU becomes late in the whole scheme of the workload, means everyone else is now waiting. So you don't want to retransmit data.
You don't want to drop data. So InfiniBand started as a lossless network. You don't want to drop data. You don't want to create latency. And then InfiniBand, in order to do that, brought congestion control and adaptive routing elements later on, and so on. It was great for scientific computing, great for HPC, and, essentially, it's great for AI, because AI is distributed computing.
And today, InfiniBand is still the gold standard for AI. Everyone that builds a network always compares their network to InfiniBand. Even when we did Spectrum X, creating Ethernet for AI, we compared it to InfiniBand. It is the gold standard.
It brings elements that exist in no other network, and it's a great solution. So if you build an AI factory, a single job running at large scale, there is nothing better than InfiniBand. It's the gold standard. Now, NVIDIA also brought Ethernet.
Right? We designed Ethernet for AI. And you can ask, if InfiniBand is so great, why did you guys bring Spectrum X? And the reason is that we believe AI is going to go everywhere.
Every data center will run AI. And therefore there will be AI clouds, multi-tenancy, multi-workload, multi-user. There will be AI in the enterprise. I'm talking about enterprise AI.
We see a lot of enterprises now adopting AI. Those areas are being built by people who have been familiar with Ethernet for many, many years. They built their software stacks. They built their management tools all on Ethernet.
And if they can continue running with Ethernet, keep their management, keep how they support their enterprise, and so on, that is much better for them. AI is evolving so fast that for them to start learning how to handle InfiniBand, to manage InfiniBand, means they're going to miss the train. So we wanted to help them.
We knew AI was going to go everywhere. And everywhere means we want to bring Ethernet to AI. We want to enable Ethernet as an option for AI, for people who build AI data centers and are familiar with Ethernet, whose software depends on the Ethernet software ecosystem.
All the tools that were created, their own management infrastructure that was built and has progressed over the years, run on Ethernet. We don't want them to have to recreate all of that. So for them, Ethernet is a great thing.
And this is where we built Spectrum X. Now, one important thing is to understand what we did in Spectrum X. Spectrum X is the first generation of Ethernet for AI, because nothing in Ethernet fit AI before Spectrum X came. But Spectrum X is actually not a first generation, and the reason is that we brought things from InfiniBand, from the multiple generations of InfiniBand that continued to evolve over the years.
We brought those elements to Ethernet. So that's why Spectrum X, on one side, is kind of the Ethernet for AI, but what we brought inside has years of development on the InfiniBand side. That's why it came in very mature, very quickly, and is completely aimed at solving the problems of AI on Ethernet.
For example, we brought lossless to Ethernet, because we don't want to drop packets. We brought adaptive routing capabilities. You have lots of flows between GPUs.
You want to make sure that every flow goes on the best available path. It's like solving the routes, in a sense. We brought congestion control to Ethernet.
You want to make sure there are no collisions, that one application cannot impact another application by creating collisions on the network. We brought many things from InfiniBand into Spectrum X and actually created Ethernet for AI. And now you have InfiniBand, which is the gold standard. If you're building a supercomputer for a single job, and you know how to manage InfiniBand and have used InfiniBand in the past, there is nothing better than that.
If you're running Ethernet, you can keep running Ethernet. We brought the best Ethernet for AI with Spectrum X. And a good example of that is that Spectrum X is running more than 100,000 GPUs in a single data center for a single workload. There is no other Ethernet technology that has managed to achieve what Spectrum X did. And the reason is that Spectrum X is built for AI.
So we have a great Spectrum X. There is great InfiniBand with Quantum. And now people can choose what makes sense for them based on their workload, what they need to serve, what they're building, their familiarity, their software ecosystem, and so on.
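As an aside, here is a toy sketch of the adaptive-routing idea described above: instead of hashing each flow onto one fixed path, where two heavy flows can collide, the fabric uses live load information to steer traffic onto the least-loaded of the equal-cost paths. The path count and flow sizes are invented and purely illustrative.

```python
import random

NUM_PATHS = 8
path_load = [0] * NUM_PATHS                  # queued bytes per path (toy units)

def static_hash_route(flow_id: int) -> int:
    # Classic ECMP-style hashing: each flow is pinned to one path.
    return flow_id % NUM_PATHS

def adaptive_route(_flow_id: int) -> int:
    # Adaptive routing: telemetry-driven choice of the least-loaded path.
    return min(range(NUM_PATHS), key=lambda p: path_load[p])

random.seed(0)
flows = [(f, random.choice([1, 1, 1, 100])) for f in range(64)]  # a few elephants

for policy in (static_hash_route, adaptive_route):
    path_load[:] = [0] * NUM_PATHS
    for flow_id, size in flows:
        path_load[policy(flow_id)] += size
    print(f"{policy.__name__:>17}: max path load = {max(path_load)}, "
          f"ideal balance = {sum(s for _, s in flows) / NUM_PATHS:.0f}")
```

The hashed policy can pile several elephant flows onto the same path while others sit idle; the load-aware policy keeps the maximum path load close to the ideal balance, which is the behavior the congestion-control and adaptive-routing discussion above is getting at.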
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: So you can put those features into Ethernet, and those are more physical-layer features that you're adding, so it doesn't affect the Ethernet that customers have been running for years?
Gilad Scheiner, Senior VP of Networking, NVIDIA: So it's not only the physical layer. It's a combination. It's the physical layer, it's the link layer, the transport level. It's the way that the NIC runs with the switches.
One of the things that made InfiniBand so great is that it's a platform. It's not a separate NIC and a separate switch; it works together. The NIC gets information from the switch network in order to determine the flow of data.
The switch element knows how everything in the data center behaves. You need to know not just your own status when you do routing on the network. You want to know the status of your neighbors, and the neighbors could be the NICs or the switches. Because if my neighbor's switch has some issues, for example, I don't want to continue sending data to the same area.
So there is global load balancing happening, and the NIC works in conjunction with the switch; it's fully end to end. So this covers everything from the PHY to the link to the transport. Now, on top of that, you have the management stack, the cloud management tools, hosting multiple tenants, and so on.
That runs on top of that infrastructure, not in the network. So what we brought into Spectrum X covers all the infrastructure elements, and everything that runs on top of it can stay the same. And this is why it goes easily into places where people built Ethernet data centers and built their software ecosystem. It goes directly in and brings them the elements of the infrastructure that are needed for running AI training or AI inferencing.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay. So your Spectrum X could be mixed in with standard Ethernet? Some other racks could be running standard Ethernet?
Gilad Scheiner, Senior VP of Networking, NVIDIA: So Spectrum X is Ethernet. It's interoperable with any other Ethernet devices. So if you build, for example, an AI data center, you build a unit of computing, right?
That means Spectrum X will be the scale-out infrastructure, for example, and it covers the full stack. Now, that data center can be connected to other parts of your infrastructure. It can connect to storage. It can connect to another data center.
It can connect to users, to their desktops, and so on. And this is where you might see other kinds of Ethernet. For connecting to desktops, traditional Ethernet is great. And of course, you can connect that traditional Ethernet into the data center that has Spectrum X for the scale-out infrastructure.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay. Great. And maybe if we could touch on too, you had mentioned the NIC cards and your DPU or the BlueField. Can you talk about the importance of that, of having control over the DPU within that same network?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. So the DPU actually brings another element to that infrastructure. When you build a data center, there isn't just one network. In the traditional world, there was one network.
If you go to hyperscale clouds, there is one network. If you build an AI supercomputer, it's a different story. And you mentioned that already, because you asked about scale-up and scale-out. So those are two networks right there.
There are at least two networks, scale-up and scale-out. Now there is also the access network, meaning users need to access the data center; that's a network. And there is also storage access, which might be yet another network.
So there are multiple elements in that AI data center, and there are different components for each. If we look at an NVIDIA AI data center, we use NVLink for scale-up, and we use Spectrum X or InfiniBand for scale-out. And that scale-out includes the switch and what we call the SuperNIC.
That SuperNIC has a compute element inside it in order to determine the injection rates and process telemetry data from the network, and so on. And then we have the DPU on the access network. What the DPU enables you to do is move the data center operating system off the server compute engine onto something else, and that greatly helps with security. If you build a data center and your hypervisor, for example, runs on the same CPU that hosts the user applications,
you have a security threat, because a user can get access to the hypervisor and then control the entire data center. So in order to make it much better, you want to separate the infrastructure domain from the application domain. The CPU will host the users on the system, but you're going to run the hypervisor, or other elements of the infrastructure operating system, on a different element, completely separate from where the applications are running.
This is where the DPU plays a role. The DPU is used to run the data center operating system, to provision the servers, to provide secure access for the users coming into the data center, and so on. So the DPU is on what we call the north-south, the access network. And then the SuperNIC and the switches, Spectrum X or Quantum, are part of the scale-out infrastructure, what some people call the back-end network, or the compute network or compute infrastructure. And then you also have the scale-up, where there is another element, NVLink.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay, great. So the DPU is also kind of freeing up the CPU to spend its cycles on what it's good at. Maybe we can switch over and talk a little bit about the scale-up network, NVLink. You have NVLink and NVSwitch, and there are other topologies out there.
I guess, what's the advantage of NVLink? There's UALink, and even Broadcom on their earnings calls has been talking about just using Ethernet as the scale-up. Can you give the gives and takes of each one?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Well, I can definitely talk about what we do. So scale-up is not easy to do. It's very much not easy to do. It needs to take those GPU ASICs and make them one.
It needs to form, like, one unit out of all the ASICs together. And therefore, it's not just the huge amount of bandwidth that needs to run between them. You need a very high message rate so that all the ASICs will connect and communicate together as one unit. You need very low latency between them.
It's a very tight network. And because of that, we are trying to put everything in a rack. So we can use, for example, copper for that connectivity, because copper consumes essentially zero power.
And because of the huge amount of bandwidth, if you did it on something else, a good amount of power would be consumed there. We want to make it very resilient, and so on. So we want to maximize copper. That's why we want to put everything in a rack, in a closed rack. This is where density becomes an interesting element to deal with, and this is where we bring liquid cooling into the game, because we want to pack everything tighter so we can maximize copper and build, like, your one unit.
So there is a good amount of complexity in actually building the NVLink element. Now, one thing that is obvious about what NVLink brings is that it's working. NVLink is in its fifth generation.
And essentially, what made InfiniBand so great is that it had many generations behind it. It continued to evolve and continued to get better and better. And that's what makes Spectrum X so great, because we took all the twenty-five years of InfiniBand and put it on Ethernet.
So putting out the idea of a network and saying, okay, my one shot is going to make it so great, in reality, that's not the case. This is a complicated element.
There is a huge amount here. Just think about it: NVL72 is, like, 130 terabytes per second in a single rack. It's like the entire peak Internet traffic is running in a single rack.
This is what you need to support, in that sense. So it's a few generations in. It continued to evolve over the years, connecting more and more GPUs. We brought SHARP into NVLink.
There are actually compute engines, compute algorithms, running on that NVLink when you're running everything together. So this is kind of NVLink. I tried to give a little bit of the complexity of it. And obviously, having the generations just shows that you evolve from GPU to GPU and you bring more elements, more capability.
You need to adjust to the workloads. I'm not sure I mentioned it, but the reason we're on an annual cadence on the infrastructure, not just on the compute, is because of the elements you need to bring into the infrastructure, including those algorithms that are added from generation to generation, because the workloads are different, because the workloads are being modified. And as the workloads are modified, the compute algorithms need to be modified, and that impacts what you put in the infrastructure, which includes NVLink and the rest.
So this is where the cadence and being robust come in. It's working. It's amazing technology. It's liquid cooled. It's dense, fully copper, and that's what makes NVLink NVLink.
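A quick back-of-the-envelope check on that 130-terabytes-per-second figure, assuming roughly 1.8 TB/s of NVLink bandwidth per Blackwell GPU and 72 GPUs per NVL72 rack (the per-GPU number is an assumption taken from NVIDIA's public fifth-generation NVLink figures, not from this conversation):

```python
gpus_per_rack = 72
nvlink_tb_per_s_per_gpu = 1.8        # assumed per-GPU NVLink bandwidth (TB/s)

aggregate = gpus_per_rack * nvlink_tb_per_s_per_gpu
print(f"aggregate NVLink bandwidth per NVL72 rack: ~{aggregate:.0f} TB/s")  # ~130 TB/s
```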
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Great. And you also announced NVLink Fusion. So you opened it up that it’s not a closed network. Can you talk about the advantages of NVLink Fusion?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. So once you get a sense of how complicated scale-up is, people may say, no, it's easy. No, it's not easy.
There is a huge amount of complexity in it. So why not help our customers who want to build their own custom accelerators, for example, leverage what we have invested for years in building the best scale-up infrastructure, with the liquid cooling, with the density, with all the aspects and all the performance of that? Why wouldn't we let our customers leverage that huge amount of investment and make it easier for them to take those custom accelerators that they build and actually leverage our infrastructure to build a solution for them? We design a data center.
We design it as a whole. And then you can take pieces of it. You can take the GPU. You can take the CPU. You can take them both together.
You can also take the infrastructure if you want to. So this is where we build, and work with, an ecosystem, which includes MediaTek and Marvell and Alchip Technologies and Astera Labs, for example, and CPU suppliers like Fujitsu and Qualcomm, working with them so they can leverage what we do. We started this talk by saying the infrastructure has become a key element. And essentially, with NVLink Fusion, we enable that key element to be used by people who need or want to build their own accelerators, and now they can leverage what we did, what we designed, and actually get a great data center for their own custom elements.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: And maybe just to understand that, with NVLink Fusion, you mentioned Qualcomm, just to use them as an example, if their CPU wants to connect, do they pay a license for NVLink, or do they just start using your NVSwitch?
Gilad Scheiner, Senior VP of Networking, NVIDIA: I think there are elements of NVLink that they will need to connect to. Essentially, they need to get the interfaces, and they need to get, for example, an NVLink chiplet that the CPU can connect to. Once they have that, they connect to the NVLink switch. So they can acquire the NVLink switch, and they can acquire the entire set of elements that comes with it, the liquid cooling, all this stuff. So they are taking elements from us.
They're taking the API from us, and, obviously, we work with them, and they can build their own system.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay. We are getting lots of questions from the audience. You introduced the silicon photonics switches at GTC, and people are asking: NVLink, when does that go to fiber? And also, when we talk about scale-out, is that already fiber? What happens with silicon photonics tied into all of this?
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. So there are different elements here. On the scale-up, let's put it like that: copper is the best connectivity.
It consumes essentially zero power. It's very reliable. And it's very cost effective.
So you would like to use copper as much as you can, for any connectivity that you can. And therefore, we're trying to put as much compute density in the rack as possible, because within that rack, we can use copper. That's why we're investing a lot to increase the amount of compute density in the rack, so we can use copper, because there is nothing better than copper. Now, when you go to the scale-out, this is where you talk about distances.
Because now you have racks that need to be connected, and you're out of the reach of copper. You need to go and use optics, optical connections. Now, in traditional data centers, the amount of connectivity between racks was very, very small. So there were not many optical transceivers or optical connections there.
When we look at an AI factory, every GPU has a NIC out. If we look at Blackwell, every Blackwell has an 800-gig NIC going out. So in the scale-out infrastructure, there is actually a good amount of optical connectivity. We need to use around six transceivers for every GPU.
So if you build a 100,000-GPU data center, that's like 600,000 transceivers. And now the power associated with the optical network becomes something that can consume up to, like, 10% of compute. So if I'm building 100,000 GPUs and I could add another 10,000, that's not a small number. So now the power becomes something you want to look at and improve.
And we all know that the limiting element in building a data center is power. It's not really space, it's power. So as much as you can save power and redirect it to compute engines, that's a great thing to do.
The thing is that data centers are increasing in size, and it's going fast. It's like, two weeks ago we were talking about 16,000 GPUs being large data centers, and now you're talking about hundreds of thousands of GPUs. So 100,000 GPUs, 600,000 transceivers, and it takes time to install that, and it takes time to manage that.
And you might need to replace elements. There are so many components to deal with. So this is the right time to improve the optical network for the scale-out. And the way to improve that is to introduce co-packaged silicon photonics.
Co-packaged silicon photonics means that instead of having the optical engines in every transceiver, I take those optical engines, put them next to the switch, and package them together with the switch. Now, what did I do here? I reduced distances.
If the optical engine is in the transceiver, the signal needs to travel through the transceiver, the cage, the PCB, the substrate, to get to the switch. I reduce the distance, and with that distance, I reduce the power. So now, at the same ISO power of the network, I can connect three times more GPUs. That's huge.
And I'm reducing transceivers. Now I have one transceiver per GPU, not six. Think about how many elements you remove from the data center. It's not just that I increase the resilience of the data center because there are fewer elements; I also reduce the time to operation.
I can build the data center much faster. So CPO brings great benefits, and we started with the scale-out because, again, the optics are like 10% of compute power, and I can increase the number of GPUs. It's huge.
It reduces the number of components. There is a huge amount of benefit in bringing co-packaged optics into the scale-out infrastructure. Now, on the scale-up, as long as I can use copper, I'm going to continue with copper. So we increase the density with copper, because there is nothing better than copper.
As long as we can use copper, we use copper. We announced that we're having 576 GPUs on copper. And scale-out is multi-rack, distance, optics; this is where co-packaged optics will be a great thing.
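Putting the round numbers from this exchange together, an illustrative tally rather than official specs: around six pluggable transceivers per GPU today versus roughly one with co-packaged optics, with optics consuming up to about 10% of compute power.

```python
gpus = 100_000
transceivers_per_gpu_pluggable = 6     # "around six transceivers for every GPU"
transceivers_per_gpu_cpo = 1           # with co-packaged silicon photonics
optics_share_of_compute_power = 0.10   # "up to, like, 10% of compute"

print(f"pluggable transceivers for {gpus:,} GPUs: {gpus * transceivers_per_gpu_pluggable:,}")
print(f"co-packaged optics transceivers:          {gpus * transceivers_per_gpu_cpo:,}")
print(f"GPUs' worth of power spent on optics:     ~{int(gpus * optics_share_of_compute_power):,}")
# The claim in the talk: at the same (ISO) network power, co-packaged
# silicon photonics lets you connect roughly three times more GPUs.
```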
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Do you have an idea if we can get to five seventy six with copper? When do we have to cut over to optics?
Gilad Scheiner, Senior VP of Networking, NVIDIA: It's a good question. You know, over the years, there were always people saying, oh, this is going to be the last generation of that.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Yeah.
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah, "this will be the last generation of that." And every time people say it's going to be the last generation, apparently there is another one. So as long as we can pack, we'll pack.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Okay, great. Let's see if there's a question we haven't covered yet. Here's one: as you're winning in the market, are you displacing Marvell and Broadcom solutions, or solution providers like Coherent?
Are you replacing their designs?
Gilad Scheiner, Senior VP of Networking, NVIDIA: answer is is is is not really. Okay? The reason is the following. there is many infrastructure in the data center. And and there is many areas that requires a need to use transceivers.
Okay? So on a scale out infrastructure, we’re going to reduce co package optics. North South network, for example, require transceivers. We we put transceivers on NIC and so And since the data centers are growing and the the market is growing, there is enough for everyone. And therefore, we’re not replacing anything, but there is different infrastructure and there’s infrastructure areas that require transceivers.
That’s one thing. The thing, we are working with that ecosystem of partners and they are part of our CPO infrastructure. So they are contributing into what we’re doing on CPO. And they’re bringing our elements, and we’re working with ecosystem. For example, you know, we announced working with TSMC on packaging, but we’re working with a lot of vendors that you mentioned on lasers and and and optical arrays and the different elements that we need for connectivity.
So they’re they’re they’re contributing to our CPO infrastructure as well. They have more or good amount of transceivers to continue and support. And, you know, data center is growing. AI is going everywhere. There is enough for everyone.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Great. I think we're out of time here now. So I would say, in summary, you've got the scale-up and scale-out networks, and you're tying together hundreds of thousands of GPUs that act as one big GPU. And if people want to come into the NVIDIA network, it's open for them to do that. You think you've got the right solutions.
So maybe if you want to give a closing remark also.
Gilad Scheiner, Senior VP of Networking, NVIDIA: Yeah. I think, you know, you had short questions and I had long answers, and sorry for that. In the past, people's data center budget was focused on, let's buy as many servers as we can. Then, if something is left, we may connect them together. If something is left after that, we may do some storage and stuff like that.
I think now folks realize that the infrastructure is key. It's not just network elements, buying a NIC and buying a switch. No. You're buying a spaceship.
You're buying a supercomputer. You're buying something that needs to be fully synchronized with the data center, and that infrastructure will determine what the data center will do.
That infrastructure will determine whether those compute engines are just a server farm or an AI supercomputer for training and inferencing. So it's a key element. Its importance will continue to increase, and we'll see innovative technologies coming into the infrastructure. So it's something that keeps us excited.
So this is where the infrastructure is. I think more people are interested in learning about that, and I'm happy we were able to talk today. I hope we gave people a better understanding of the infrastructure that we built.
Kevin Cassidy, Semiconductor Analyst, Rosenblatt Securities: Yeah, that’s great. Thank you. Thanks, Stuart. Thank you, Gilad. Thank you very much.
Stuart Steckert, Senior Director of Investor Relations, NVIDIA: Thanks, Kevin. Thanks, everyone.