GUUG mal, Contribution 002 (January 2000)



The following text is a transcript of the first minutes of a talk about the upcoming 2.4 release of the Linux kernel. Alan Cox gave the talk at the Linuxtag in Bremen and kindly provided his slides for this CD-ROM. Because the room was dark during the talk, the video did not come out very sharp, so the recording is audio only. As an experiment, the second track on this CD-ROM also contains the talk (besides the usual MPEG file). I apologize for any errors I made while writing down Alan's talk.


Alan Cox about Linux 2.4

This is intended to be a very simple talk about what is going on in Linux and what we hope to get into Linux 2.4. I will try to speak slowly enough so everybody can follow me. I apologize in advance for the accent, but I was born with it.

The first slide shows the three big areas we are aiming at to bring Linux up to 2.4, the things people need now. The biggest one is what I call upwards scalability; that is the marketing term. In real language it means running on bigger, faster and better machines. Business people want this because there is a lot of interest in Linux as a server. Things like DB2, Oracle and SAP are the kind of big applications that demand big machines. Developers are very interested, because then people send them big machines to develop on.

The second thing people are working on is support for more platforms, more architectures. In part this is small things, like more generations of Alpha machines being supported. There are bigger things going in, such as the port to the ARM processor. PowerPC used to be on this list. PowerPC has now been merged into 2.2, so there are fewer things to go in. We have been back-merging several things into 2.2 this way over time. But you will also see more Sparc64 support and, in time, possibly PA-RISC support, although that could be something for after 2.4 is out. And hopefully other platforms along the way. There are people working on embedded platforms, saving Windows CE machines, for example.

The third part that is going in is getting Linux to work on small machines. Linux on the Palm Pilot is really more of a hack than a useful system. But there are people trying to get Linux onto machines like the [?] five, the [?]. They have very tight resource limits, so there are several challenges to address here.

This is the slide about large multiprocessor machines. If you like the box on the right side, by the way, that is one of VA's standard Linux boxes. They are already selling machines of that size. So in fact we already have a problem scaling to four-processor machines, because they are being sold and people want to use them effectively under Linux. They work, but they could work a lot faster. For 2.2, depending on who you talk to, we scale to between two and four processors. That means you get realistic improvements up to about that many CPUs. When you go beyond four, unless your job is very compute intensive, with 2.2 you won't gain significant performance advantages. For 2.4, we've got to change that. Eight- or sixteen-way PCs are nearly on the market. Several chip vendors I've talked to at the Microprocessor Forum are putting four CPUs on one chip. So we will get to see even more multiprocessor machines at even lower prices. And the UltraSparc 10000 already exists, already partly runs Linux, and you can put 64 processors in one. So we have some real scaling problems for machines of that kind of size. At the moment, probably nobody can afford an UltraSparc 10000, but given the rate at which computer prices change, within two years machines of that power are likely to be readily accessible to ordinary users. And two years is a realistic lifetime for a kernel. So we have to address these things well in advance, as well as possible.

There are things you have to do to make a multiprocessor machine go faster. The biggest one is what people call fine-grained locking. In the original Linux kernel for multiprocessing, the 2.0 kernel, there is a single lock and only one processor can be in the kernel at any time. It is a very good way of making a kernel multiprocessor-capable without doing a lot of the really hard work. It is not ideal, but it gets you started, gets you basic support. But with a lot of processes wanting to be in the kernel, your performance goes completely down the drain. With a finer-grained locking system, you have multiple locks within the kernel, so you can have multiple processes doing things in the kernel that don't clash. In 2.2, things like interrupt handling run in parallel with other kernel operations, and so do network operations and I/O operations. The core of the system [?] and the caches were still basically one process at a time only. And that gives you I/O limitations [?] which are there in 2.2 at the moment. The network layer has a similar problem. On a four-processor machine, you would like to be receiving packets and handling interrupts on one processor, doing the network stack receive processing on the second processor and writing data back to the network on the third processor, maybe with an application on the fourth. At the moment we do very well when writing data to the network, but when receiving data on a multiprocessor machine, a lot of the incoming information to [?]. Even with Linux 2.2, only one or two processors will get involved in this. So people like Dave Miller and Alexis [?] are doing fine-grained locking in the networking layer. And we already see significant performance advantages on machines doing large jobs.
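
The difference between a single big lock and finer-grained locks can be sketched in user space. The following Python sketch is only an illustration and not kernel code: the class names BigLockCounters and FineGrainedCounters, the helper hammer and the subsystem names "net" and "disk" are all made up for this example. Both variants produce the same results; the difference is that in the second one a thread updating "net" never has to wait for a thread updating "disk". Python's own interpreter lock means this will not show a real speed-up, it only shows the structure.

```python
import threading


class BigLockCounters:
    """All counters share one lock, similar in spirit to a single kernel lock."""

    def __init__(self, names):
        self._lock = threading.Lock()
        self._counts = {name: 0 for name in names}

    def add(self, name, n=1):
        # Every update, whatever it touches, queues up on the same lock.
        with self._lock:
            self._counts[name] += n

    def get(self, name):
        with self._lock:
            return self._counts[name]


class FineGrainedCounters:
    """Each counter has its own lock, so unrelated updates never clash."""

    def __init__(self, names):
        self._locks = {name: threading.Lock() for name in names}
        self._counts = {name: 0 for name in names}

    def add(self, name, n=1):
        # Only threads touching the same counter wait for each other.
        with self._locks[name]:
            self._counts[name] += n

    def get(self, name):
        with self._locks[name]:
            return self._counts[name]


def hammer(counters, names, per_thread=1000):
    """Run two threads per counter name, each adding 1 per_thread times."""

    def worker(name):
        for _ in range(per_thread):
            counters.add(name)

    threads = [
        threading.Thread(target=worker, args=(name,))
        for name in names
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {name: counters.get(name) for name in names}


if __name__ == "__main__":
    subsystems = ["net", "disk"]  # hypothetical subsystem names
    print(hammer(BigLockCounters(subsystems), subsystems))
    print(hammer(FineGrainedCounters(subsystems), subsystems))
```

The price of the second variant is that there are more locks to get right, which is the "really hard work" mentioned above.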

The scheduler is important. The scheduler is responsible for handing out jobs to each processor. On a multiprocessor machine, the scheduler also has to try not to move processes between processors. And there are lots of scheduling issues, one of which is called the thundering herd problem. On a big machine, you have many processes waiting for a resource. With a traditional scheduler like the one in Linux 2.2, when the resource becomes available you wake up every process that wants it. So imagine you have a large Apache server where you have 200 Apache processes running. In some cases you might wake up 200 processes, one of which gets the resource, and the other 199 will run a bit and go back to sleep. We don't want that kind of inefficiency in the system. So the scheduler has to be improved. Most of the work there has been done, and we support wake_one semantics. That is, when you give up a resource, you wake just one process, it gets the resource, and it is handed on. It [?] to do this, but people like Ingo have done the work.
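
The cost of the thundering herd can be estimated with a small simulation. The sketch below is a simplified model and not the kernel's scheduler: the function name count_wakeups and the idea of modelling the wait queue as a plain deque are assumptions made for this illustration. It counts how many times a process is woken until every waiter has had the resource once, either waking everybody on each release or waking just one.

```python
from collections import deque


def count_wakeups(n_waiters, wake_one):
    """Count wakeups until each of n_waiters has obtained the resource once.

    wake_one=False: every release wakes all waiters; one wins, the rest
                    run briefly and go back to sleep.
    wake_one=True:  every release wakes only the first waiter in the queue.
    """
    waiters = deque(range(n_waiters))
    wakeups = 0
    while waiters:
        # The resource becomes available once per loop iteration.
        if wake_one:
            woken = [waiters.popleft()]
        else:
            woken = list(waiters)
            waiters.clear()
        wakeups += len(woken)
        # The first woken process gets the resource; the others lose
        # the race and are put back on the wait queue.
        losers = woken[1:]
        waiters.extend(losers)
    return wakeups


if __name__ == "__main__":
    # 200 waiters, as in the Apache example above.
    print(count_wakeups(200, wake_one=False))  # 20100
    print(count_wakeups(200, wake_one=True))   # 200
```

In this model, waking everybody costs 200 + 199 + ... + 1 = 20100 wakeups to serve 200 processes, while waking one at a time costs exactly 200.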

Big machines... I'm sure everybody here remembers the time when eight megabytes was a lot of memory. Nowadays, you can't buy eight megabytes of memory. That scale of memory growth is likely to continue. So while we are looking at four gig machines (most people thinking "I wish"), one gig SIMMs are likely to be around fairly soon and, in time, the one gig SIMM may be the standard that you get in PCs. So we need to support large amounts of memory. The other reason we have to support large amounts of memory [?] large databases and similar applications are used with Linux. And database vendors seem to believe that everybody should have as much memory as possible. And they like to allocate all of it. At the moment, with Linux 2.2, we support 2 gigabytes of memory. There are some patches done by Siemens which allow you to get 4 gigabytes of memory. They are not yet part of the standard kernel, but I think most vendors ship them. They are stable, but the patches are a new feature, not really a candidate for a standard kernel. For 2.4, four gigabytes works reliably. Further work was done by people like Ingo, and in theory we support 64 gigabytes of main memory in a PC. We will hopefully eventually support this on other architectures. But there is a certain amount of work to be done for this. One problem here is that on the PCI bus, most cards can only address 4 gigabytes of memory. So the old problem of ISA bus cards only being able to access the low part of memory has been replicated for the low four gigabytes of memory. History has repeated itself in inconvenient ways.
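
The memory limits mentioned above follow directly from the number of address bits. The sketch below works through the arithmetic. The bit widths are not stated in the talk and are filled in here as assumptions: 24 bits for ISA DMA, 32 bits for a typical PCI card and 36 bits for the extended physical addressing that gives 64 gigabytes. The function names addressable_bytes and device_can_reach are made up for this illustration.

```python
MIB = 2 ** 20
GIB = 2 ** 30


def addressable_bytes(address_bits):
    """Number of bytes that can be addressed with the given number of bits."""
    return 2 ** address_bits


def device_can_reach(phys_addr, length, device_address_bits):
    """True if the whole buffer lies below the device's addressing limit."""
    return phys_addr + length <= addressable_bytes(device_address_bits)


if __name__ == "__main__":
    print(addressable_bytes(24) // MIB)  # 16 (MiB, assumed ISA DMA limit)
    print(addressable_bytes(32) // GIB)  # 4  (GiB, 32-bit PCI card)
    print(addressable_bytes(36) // GIB)  # 64 (GiB, assumed 36-bit addressing)

    # A 4 KiB buffer at 1 GiB is reachable by a 32-bit card,
    # the same buffer at 5 GiB is not.
    print(device_can_reach(1 * GIB, 4096, 32))  # True
    print(device_can_reach(5 * GIB, 4096, 32))  # False
```

A machine with more than 4 gigabytes of memory therefore has to make sure that data for such a card ends up in memory the card can actually reach, just as it once had to for ISA cards and low memory.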

The 64 gigabyte support requires CPUs with very specific extensions. In the Intel market, AMD and Intel ship support for this: the AMD Athlon and the high-end Pentium II/III machines based on the Xeon chips have the extensions for very large amounts of memory. So we want to support it. The other big CPU features, which came in and mostly weren't noticed, were MMX and the streaming instructions. That is what Intel was calling KNI, at least until the marketing people picked a real name for them. We want to support these, because games and music especially can make good use of them. And for some kernel functions, like copying large amounts of memory, we can also make use of these facilities within the kernel. At the moment the kernel tree just has AMD Athlon patches in it; the patches in the queue include support for the Pentium III with the streaming instructions. And I have some alpha test patches from someone who felt he wanted his K6 to go a little bit faster. And he did vast amounts of work for a two or three percent increase. The work has been done, so we will use it.